June 1952 


Vol.8 No.2 
JOURNAL OF THE BIOMETRIC SOCIETY 


A Method of Enumeration of Individual Marrow Elements 
for Research Purposes J. Sharp, E. L. Feinmann and J. F. Wilkinson 


The Problem of Birth Ranks E. S. Keeping 


A Comparison of Litchfield-Wilcoxon and 
Bliss Estimates Frieda F. Eisenberg 


Analysis of Partially Balanced Incomplete Block 
Designs Illustrated on the Simple Square and 
Rectangular Lattices K. R. Nair 


The Computation of Sums of Squares and Products 
on a Desk Calculator J. M. Hammersley 




The Biometric Society 


FOUNDED BY THE BIOMETRICS SECTION OF THE AMERICAN STATISTICAL ASSOCIATION 


TABLE OF CONTENTS 


A Method of Enumeration of Individual Marrow Elements for Research Purposes 
J. SHARP, E. L. FEINMANN AND J. F. WILKINSON 

The Problem of Birth Ranks 
E. S. KEEPING 

A Comparison of Litchfield-Wilcoxon and Bliss Estimates 
FRIEDA EISENBERG 

Analysis of Partially Balanced Incomplete Block Designs Illustrated on the Simple Square and Rectangular Lattices 
K. R. NAIR 

The Computation of Sums of Squares and Products on a Desk Calculator 
J. M. HAMMERSLEY 

Queries 

Abstracts 

The Biometric Society 


Number 2 June 1952 Volume 8 




Material for Biometrics should be addressed to Miss Gertrude Cox, Institute of 
Statistics, Box 5457, Raleigh, North Carolina, except that authors residing in one of 
the following organized regions can expedite the handling of their papers by sub- 
mitting them to the Assistant Editor for that region. 

British Region: Dr. D. J. Finney, 6 Keble Road, Oxford, England; Australasian 
Region: Dr. E. A. Cornish, University of Adelaide, Adelaide, Australia; French 
Region: Dr. Georges Teissier, Faculte des Sciences de Paris, 1 rue V. Cousin, Paris, 
France 


Material for Queries should go to Professor G. W. Snedecor, Statistical Laboratory, 
Iowa State College, Ames, Iowa. 


Articles to be considered for publication should be submitted in triplicate. 


THE BIOMETRIC SOCIETY 
General Officers 
President, Georges Darmois; Secretary-Treasurer, C. I. Bliss; Council, Maurice H. 
Belz, C. W. Emmens, D. J. Finney, R. A. Fisher, J. W. Hopkins, J. O. Irwin, N. K. 
Jerne, Arthur Linder, P. C. Mahalanobis, Kenneth Mather, Margaret Merrell, 
A. M. Mood, C. R. Rao, P. V. Sukhatme, Georges Teissier, E. B. Wilson. 


Regional Officers 
Eastern North American Region: Vice-President, H. W. Norton; Secretary-Treasurer, 
Walter T. Federer. British Region: Vice-President, Frank Yates; Secretary, D. J. 
Finney; Treasurer, A. R. G. Owen. Western North American Region: Vice-President, 
G. A. Baker; Secretary-Treasurer, W. C. Rollins. Australasian Region: Vice-Presi- 
dent, C. W. Emmens; Secretary-Treasurer, J. A. Keats. French Region: Vice-Presi- 
dent, Georges Teissier; Secretary-Treasurer, Daniel Schwartz. 


Editorial Board 
Biometrics 
Editor: Gertrude M. Cox; Assistant Editors and Committee Members: C. I. Bliss, 
E. A. Cornish, W. J. Dixon, John W. Fertig, D. J. Finney, O. Kempthorne, Horace 
W. Norton, A. M. Mood, H. Fairfield Smith, G. W. Snedecor, Georges Teissier and 
Jane Worcester. Managing Editor: Sarah P. Carroll. 


The Biometric Society is an international society devoted to the mathematical and statistical 
aspects of biology and welcomes to membership biologists, mathematicians, statisticians and others who 
are interested in its objectives. Through its regional organizations the Society sponsors regional and 
local meetings. National secretaries serve the interest of members in Italy, Denmark and the Netherlands 
and there are many members “at large”. Dues in the Society for 1952 for residents of the Western 
Hemisphere are as follows: Full membership including subscription to Biometrics is $7.00. Members 
of the Biometrics Section of the American Statistical Association who subscribe to the journal through 
that organization may become members of The Biometric Society on the payment of $3.00 annual dues. 
For members in other parts of the world, full membership including subscription to Biometrics is $4.50, 
except that members who subscribe to the journal through the American Statistical Association pay 
annual dues of $1.75. Information concerning the Society can be obtained from the Secretary, The 
Biometric Society, Box 1106, New Haven 4, Connecticut, U.S.A. 


Annual subscription rates to non-members are as follows: For American Statistical Association 
Members, $4.00; for subscribers, non-members of either American Statistical Association or The Bio- 
metric Society, $7.00. Subscriptions should be sent to the Managing Editor, Biometrics, P. O. Box 
5457, Raleigh, North Carolina, U.S.A. 


Entered as second-class matter at the Post Office at New Haven, Conn., under 
the Act of March 3, 1879. Additional entry at Richmond, Va. Business Office, 
52 Hillhouse Ave., New Haven, Conn. Biometrics is published quarterly—in March, 
June, September and December. 




A METHOD OF ENUMERATION OF INDIVIDUAL 
MARROW ELEMENTS FOR RESEARCH PURPOSES 


J. SHARP, B.Sc., M.B., Ch.B., M.R.C.P., SENIOR REGISTRAR 
E. L. FEINMANN, M.B., Ch.B., M.R.C.P., MEDICAL REGISTRAR* 
JOHN F. WILKINSON, M.Sc., Ph.D., M.D., F.R.C.P., F.R.I.C., PHYSICIAN AND 


Dept. of Haematology, Manchester Royal Infirmary 
and Manchester University 


In any study requiring the assessment of quantitative changes of a 
particular cell type in a suspension of mixed marrow elements, a 
method is essential whereby these cells may be counted with a known 
degree of accuracy. This problem was encountered in experiments 
designed to investigate the possible presence of a factor in pernicious 
anaemia serum affecting the rate of maturation of erythroblasts cultured 
in vitro, as reported by several workers (Rusznyak et al 1947, Lajtha 
1950, Thompson 1950). Our results are presented elsewhere (Feinmann, 
Sharp and Wilkinson, 1952). In these experiments it was necessary to 
count absolute numbers of certain erythroblasts to a known degree of 
accuracy in order to determine whether observed differences in changes 
in their numbers on culture in different media were significant. The 
method of counting is described in detail since, with suitable modifica- 
tion, it has more general application. 

When it is not possible to count directly the number of cells in a 
known volume of fluid, the absolute number of a cell type per unit 
volume is derived by determining its proportion on smears and relating 
this to the total cell count. 

It is well known that in smears prepared on slides, the larger cells 
appear more frequently at the ends and edges of the smear than in the 
centre (Stephens et al 1920, Napier 1922, Gyllenswärd 1929, Boveri 1939, 
Dacie 1950). Methods of sampling of the slide, giving reasonably con- 
stant estimates of the proportion of a cell type on smears from one blood 
sample, have been devised (Dacie 1950). Our problem required the 
assessment of proportions of normal and megaloblastic erythroblasts in a 
marrow suspension, initially and after a period of incubation; during 
this incubation there might be not only a change in the relative proportions 
of cells but also maturation of individual cells with a consequent 
decrease in cell size. The degree of maturation would, therefore, influence 
cell distribution and thus counts on slides from a specimen before 
and after incubation taken with a uniform sampling technique could 
not be strictly comparable. Smears prepared on coverslips were used 
since it is generally considered that the distribution of cells on these is 
less affected by mechanical factors (Boveri 1939), and Barnett (1933) has 
given a convincing demonstration that near random distribution of 
blood leucocytes is achieved. 

*B.M.A. Scholar. 

It is common practice in differential counting to count a large number 
of cells on two or three smears. The number of estimates of the propor- 
tion thus obtained is usually too small to give an adequate assessment 
of the standard deviation and the mean proportion may be inaccurately 
estimated if the variation between smears is large. In the method used 
here a mean proportion with its standard deviation may be derived from 
a count of the same, or possibly a smaller total number of cells, by count- 
ing the cells in smaller groups on different smears. Further, there is an 
increased chance of such an estimate approximating more closely to the 
true mean of the population since a larger number of drops of cell sus- 
pension is examined; any one drop may be an inadequate sample of the 
whole suspension. 

In the method described below, the proportion of certain erythro- 
blasts in the total nucleated cells is estimated in two stages. In the first 
stage, the proportion of these erythroblasts in the total erythroblast 
population is ascertained and, in the second, the proportion of all types 
of erythroblasts in the total nucleated cells. The use of two stages may 
appear cumbersome and unnecessary. The accuracy of an estimate of 
the proportion of a cell type is related to the number of these cells 
counted. A direct estimate, with the same accuracy, of the proportion 
of these erythroblasts in all other nucleated cells would require the count- 
ing of a considerably larger number of cells. When there was a low 
proportion of the particular erythroblasts it was possible, by the two 
stage method, to count larger numbers of these cells without counting 
unnecessarily large numbers of cells in which we were not interested. 

In any method involving such highly subjective assessments as slight 
changes in cellular appearances, differences in interpretation by different 
observers might seriously affect the significance of results. A series of 
39 duplicate counts made on the same preparations by two of us (E.L.F. 
and J.S.) indicated no significant difference in recognition of normoblasts 
and “typical” megaloblasts. There was some discrepancy in assessing 
the earliest recognisable “megaloblastic” change in a proerythroblast. 
Therefore, to reduce the observer error in the data, proerythroblasts and 
megaloblasts were considered together as one group, hereafter referred 
to as “P + M”. The nomenclature adopted and the diagnostic criteria 
of the various cell types are those advocated by Israëls (1948). 

As is described later, the mean for a specimen was derived from a 
series of counts. In each case approximately equal numbers of counts in 
each series were performed independently by two of us (E.L.F. and J.S.). 
For counting, each smear was allotted a random number so that the 
observer was not aware from which specimen the smear originated. 


Material. 


1. B.S.S. white blood cell pipettes. 

2. Türk’s solution as diluent for counts of total nucleated cells per count 
volume. 

3. “Bright Line” Spencer counting chambers with improved Neubauer 
ruling. 

4. Smears prepared on coverslips from concentrated suspensions of 
marrow cells in mixtures of serum and Gey’s salt solution (Gey & Gey, 
1936) as detailed elsewhere (Feinmann, Sharp and Wilkinson, 1952). 
They were stained by Jenner-Giemsa stains using Sörensen’s phosphate 
buffer, pH 6.4. 


The staining properties of the cells on smears from different experi- 
ments were found to vary. It was necessary to adjust the concentration 
of the Giemsa stain and its time of contact with the smears by trial and 
error for the smears from each experiment. The smears were fixed for 
3 minutes in Jenner’s stain and then stained for 1 minute in equal parts 
of Jenner’s stain and buffer. They were then stained in Giemsa’s stain 
diluted with buffer for periods varying between about 33 and 43 minutes. 
The proportion of Giemsa’s stain to buffer averaged about 1 to 10 by 
volume. After drying on blotting paper the coverslips were mounted on 
glass slides using “Xam” (Gurr). 

Scrupulous cleanliness of coverslips was found to be essential for the 
preparation of satisfactory smears. The best results were obtained with 
coverslips boiled for 20 minutes in equal parts of concentrated nitric and 
hydrochloric acids, thoroughly washed in tap water and at least six 
changes of glass distilled water in which they were stored. They were 
polished shortly before use with the device described by Cameron (1950). 
Cardboard box lids with numbered slits were found useful as holders for 
the coverslips which, for staining, were transferred to slits in a metal strip 
arranged to hook over the sides of a dish containing stain so that the 
coverslips were immersed. Large numbers of smears may be rapidly 
stained by this means. 




Principle of the method. 


A cell type, “X”, forms a proportion of a total cell population, “T”. 

If $\bar{x}$ be the mean of estimates of the proportion of “X” cells and $\sigma_x$ 
its Standard Deviation, and $\bar{T}$ be the mean value of estimates of “T” 
and $\sigma_T$ its Standard Deviation, it can be shown that: 

The Standard Deviation of the product, i.e. of the number of “X” 
cells in absolute terms 


$$= \sqrt{\sigma_x^2 \cdot \sigma_T^2 + \bar{x}^2 \sigma_T^2 + \bar{T}^2 \sigma_x^2} \qquad (i)$$
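
The behaviour of a formula of this kind can be checked by simulation. The sketch below assumes that formula (i) is the usual exact expression for the standard deviation of a product of two independent quantities; the function name `product_sd`, the normal distributions and all numerical inputs are hypothetical choices made only for illustration.

```python
import math
import random
import statistics

def product_sd(mean_x, sd_x, mean_t, sd_t):
    """SD of the product of two independent estimates (assumed exact formula)."""
    var = sd_x**2 * sd_t**2 + mean_x**2 * sd_t**2 + mean_t**2 * sd_x**2
    return math.sqrt(var)

# Hypothetical inputs: a proportion of 0.30 (SD 0.03) and a total count of
# 20,000 cells per unit volume (SD 1,600).
mean_x, sd_x = 0.30, 0.03
mean_t, sd_t = 20000.0, 1600.0

# Simulate independent draws of the two estimates and multiply them.
random.seed(1952)
products = [random.gauss(mean_x, sd_x) * random.gauss(mean_t, sd_t)
            for _ in range(100_000)]

formula_sd = product_sd(mean_x, sd_x, mean_t, sd_t)
simulated_sd = statistics.pstdev(products)
print(round(formula_sd, 1), round(simulated_sd, 1))
```

The two printed values should agree to within sampling error, which would not be the case if the two estimates were correlated.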


Application of the principle. 


(a) The nucleated cell count per unit volume. 

This was performed by the usual technique for leucocyte counting. 
Three pipettes and three chambers were loaded from each specimen 
which was contained in a watch glass treated with liquid silicone to pre- 
vent adherence of cells to the glass. A minimum total number of 200 
cells was counted. In comparative counts on one specimen the same 
pipettes and chambers were used on each occasion. 

The error in the nucleated cell count has been fully investigated by 
Berkson et al (1939), and more recently by Chamberlain and Turner 
(1951). The former workers obtained the following formula for the 
coefficient of variation of the leucocyte count: 


$$V_T^2 = \frac{100^2}{n} + \frac{4.6^2}{n_c} + \frac{4.7^2}{n_p}, \quad \text{where}$$


$V_T$ = the coefficient of variation of the count, 


$n$ = the number of cells counted, 
$n_c$ = the number of chambers used, 
$n_p$ = the number of pipettes used. 


Chamberlain and Turner, analysing the results of routine blood 
counts of employees at the Atomic Energy Research Establishment, 
Harwell, obtained a combined coefficient of variation for errors due to 
variation in pipettes and chambers and to variation in composition of 
consecutive drops of blood from a finger prick of 8.31%. In our experi- 
ments, and those of Berkson et al, the latter source of variation was 
avoided. The coefficient of variation attributable to pipettes and cham- 
bers calculated from the formula is 6.58%. As Chamberlain and Turner 
point out, rather greater variation was to be expected in their experiment 
where the counts were performed by technicians in the course of routine 
work. We considered it justifiable, therefore, to employ the formula of 
Berkson et al. Counting at least 200 cells and using 3 pipettes and 3 
chambers the coefficient of variation should not exceed 8%. 
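
As a rough numerical check of the figures quoted above, the coefficient of variation can be evaluated directly. In this sketch the component values 4.6 and 4.7 per cent for chambers and pipettes are an assumption: they are the values that reproduce the 6.58% quoted for pipettes and chambers combined. The function name `count_cv` is hypothetical.

```python
import math

# Assumed component coefficients of variation (per cent); chosen so that
# sqrt(4.6**2 + 4.7**2) matches the 6.58% quoted in the text.
CV_CHAMBER = 4.6
CV_PIPETTE = 4.7

def count_cv(n_cells, n_chambers, n_pipettes):
    """Coefficient of variation (%) of a nucleated cell count."""
    return math.sqrt(100.0**2 / n_cells
                     + CV_CHAMBER**2 / n_chambers
                     + CV_PIPETTE**2 / n_pipettes)

# One pipette and one chamber, ignoring the cell-count term.
instrument_cv = math.sqrt(CV_CHAMBER**2 + CV_PIPETTE**2)
# The minimum procedure described: 200 cells, 3 chambers, 3 pipettes.
cv_200 = count_cv(200, 3, 3)
print(round(instrument_cv, 2), round(cv_200, 2))
```

Under these assumptions the minimum procedure gives a coefficient of variation of about 8 per cent, and counting more than 200 cells lowers it further.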

(b) The absolute number of P + M was estimated as a proportion 
of the total nucleated cell count. As stated previously, this was done 
in two stages. 

In the first stage the percentage of erythroblasts which were P + M 
was determined. The number of P + M in a count of 100 erythroblasts 
was determined on each of a minimum of 8 separate smears for each 
specimen. In the great majority of cases 10 or more determinations 
were made although very occasionally it was necessary to use one smear 
for more than one count. The mean percentage of erythroblasts which 
were P + M and the standard deviation of this mean for the series of 
determinations were calculated. 

In the second stage, the mean percentage of total nucleated cells 
which were erythroblasts was determined in a similar manner and its 
standard deviation calculated. 

One hundred cells was chosen as a “unit” since preliminary counts of 
400 and 100 cells on the same smears from one specimen showed no 
decrease in the standard deviation of the mean for the larger unit, 
indicating that the primary source of variation was between units. 


Size of “unit”                       400 cells    100 cells 
Mean percentage erythroblasts          35.1         29.5 
Number of “units” counted              12           11 
Standard deviation                     11.4         10.1 


(c) The product: 

The mean percentage of erythroblasts which are P + M × the mean 
percentage of total nucleated cells which are erythroblasts = the mean 
percentage of total nucleated cells which are P + M. 

Application of Formula (i) above gives the standard deviation of 
this product. 

Repetition of this process using this product and the total nucleated 
cell count per unit volume gives the estimate of P + M per unit volume 
in absolute numbers, with its standard deviation. 
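
The two successive applications of the product rule can be sketched as follows. All figures are hypothetical and are written as fractions rather than percentages; the helper `product_with_sd` assumes the exact formula for the variance of a product of two independent estimates, which may differ in detail from the form used in the original computations.

```python
import math

def product_with_sd(mean_a, sd_a, mean_b, sd_b):
    """Mean and SD of the product of two independent estimates."""
    var = sd_a**2 * sd_b**2 + mean_a**2 * sd_b**2 + mean_b**2 * sd_a**2
    return mean_a * mean_b, math.sqrt(var)

# Hypothetical (mean, SD) pairs.
pm_in_erythroblasts = (0.20, 0.02)      # stage 1: P + M among erythroblasts
erythroblasts_in_all = (0.30, 0.03)     # stage 2: erythroblasts among nucleated cells
nucleated_per_unit = (20000.0, 1600.0)  # total nucleated cells per unit volume

# Step (c): proportion of all nucleated cells which are P + M.
pm_fraction = product_with_sd(*pm_in_erythroblasts, *erythroblasts_in_all)
# Repeating the process with the total count gives P + M per unit volume.
pm_absolute = product_with_sd(*pm_fraction, *nucleated_per_unit)
print(pm_fraction, pm_absolute)
```

Each stage contributes its own variance, so the relative error of the final absolute figure is larger than that of any single stage.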


Discussion. 


The method described is laborious. Great saving in the amount of 
counting labour may be achieved by use of the technique of “Balanced 
Sampling” devised by Woolf (1950). This technique is designed for the 
estimation of the proportion of relatively sparsely occurring elements on 
blood slides. In brief, the number of these sparsely occurring elements to 
be counted is chosen appropriate to the degree of accuracy required in 
the count. The ratio of the number of microscope fields traversed in 
counting the chosen number of the sparsely occurring and the same 
number of frequently occurring elements represents their relative pro- 
portions. A somewhat similar technique employing a disc in the ocular 
lens system of the microscope graduated so that the microscope field is 
divided into large and small areas of known proportion has been 
described by Schneidermann and Brecher (1950). Balanced Sampling, 
however, presumes a random distribution of cells on the smear; since 
occasional clumps of four or more leucocytes were observed on some of 
our smears, it was thought that this might seriously affect the accuracy of 
estimates of proportions by this method. For this reason, the usual 
method of direct counting was employed in this study but the opportunity 
was taken to compare the results of counts by the direct and “Balanced 
Sampling” techniques on our smears. 
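
The arithmetic behind the field-ratio idea summarised above can be illustrated briefly. This is only a reading of that summary, not Woolf's own formulation: it assumes cells are randomly distributed over fields, and the function name and the counts used are hypothetical.

```python
def balanced_sampling_proportion(fields_for_sparse, fields_for_frequent):
    """Proportion of the sparse element among the two element types.

    The same number of elements of each type is counted, so the density of
    each type per field is inversely proportional to the fields traversed.
    """
    if fields_for_sparse <= 0 or fields_for_frequent <= 0:
        raise ValueError("field counts must be positive")
    return fields_for_frequent / (fields_for_sparse + fields_for_frequent)

# Hypothetical count: 50 sparse cells required 40 fields, while 50 frequent
# cells required only 10 fields.
p = balanced_sampling_proportion(40, 10)
print(p)
```

Clumping of cells would make the number of fields traversed an unreliable guide to density, which is the concern raised above.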

Twelve counts of the percentage of erythroblasts of the total nu- 
cleated cells on different smears from one marrow suspension were per- 
formed in the usual manner counting a total of 400 cells in each case. 
The observed number of erythroblasts on each smear was made the 
basis of a count by balanced sampling technique. The mean percentage 
by direct counting was 33.9 with a standard deviation of 11.2 and by 
balanced sampling, 33.5 with a standard deviation of 10.6. 

This result would appear to indicate that if on otherwise technically 
satisfactory coverslip smears clumping of cells is not marked, balanced 
sampling will give accurate estimates of cell proportions. Thus, the 
proportion of P + M of all other nucleated cells could be estimated 
directly and applied to the total nucleated cell count to derive an abso- 
lute figure, thereby appreciably diminishing the counting labour and 
avoiding much tedious computation. A further advantage of such a 
direct estimate is that a predictable degree of accuracy can be attained 
according to the number of cells counted. 


Summary. 


1. A method is described whereby the count per unit volume of a 
single marrow element with its standard error may be estimated. 
2. Use of Woolf’s “Balanced Sampling” technique in conjunction 
with this method appears valid so long as there is no marked clumping of 
the cells on coverslip smears. 




Acknowledgements. 


We wish to thank Mr. A. Duval for deriving the formula for the 
standard error of the product of two independent variables, and Pro- 
fessor Bartlett and Miss N. Goodman for advice. 

The work was supported by a grant from the Medical Research 
Council and was carried out during the tenure by one of us (E.L.F.) of a 
B.M.A. Scholarship. 


REFERENCES 


Barnett, C. W., J. Lab. Clin. Med., 12, 77, 1933. 

Berkson, J., Magath, T. V., Hurn, M., Amer. J. Physiol., 128, 309, 1939. 

Boveri, R. M., Guy’s Hosp. Rep., 89, 112, 1939. 

Cameron, G., Tissue Culture Technique. New York: Academic Press Inc., 2nd Edn., 
p. 15, 1950. 

Chamberlain, A. C. and Turner, F. M., Statistical Review of Haematological Techniques 
and Records, Atomic Energy Research Establishment Report, MED/R 708, 1951. 

Dacie, J. V., Practical Haematology, Churchill, London, 1950. 

Feinmann, E. L., Sharp, J., Wilkinson, J. F., Brit. med. J., in the press. 

Gey, G. O., and Gey, M. K., Amer. J. Cancer, 27, 45, 1936. 

Gyllenswärd, C., Acta. Paediat., 8, Supp. 2, i, 1929. 

Israëls, M. C. G., An Atlas of Bone-Marrow Pathology, Heinemann, London, 1948. 

Lajtha, L. G., Clinical Science, 9, 287, 1950. 

Napier, I. E., Ind. Med. Gaz., 57, 176, 1922. 

Rusznyak, Lowinger and Lajtha, L. G., Nature, 160, 757, 1947. 

Schneidermann, M. and Brecher, G., Biometrics, 6, 390, 1950. 

Stephens, J. W. W. et al., Ann. Trop. Med., 14, 371, 1920. 

Thompson, R. B., Clinical Science, 9, 281, 1950. 

Woolf, B., Edin. Med. J., 57, 536, 1950. 


THE PROBLEM OF BIRTH RANKS 


E. S. Keeping 
University of Alberta 


A CHARACTERISTIC C is observed in the r-th member (in order of birth) 
of a family of children of size s. The question arises whether C 
tends to be associated with earlier (or later) births rather than to be 
randomly distributed over all values of r from 1 to s. The problem was 
brought to my attention by Dr. Margaret Thompson, of the University 
of Alberta, who is interested in the occurrence of celiac children in human 
families, and who observed a significant departure from randomness in 
the sample studied (1). 

This paper discusses various tests of significance which have been 
suggested for the birth-order effect, and proposes a simplification of the 
exact test when the characteristic C rarely occurs more than once in the 
same sibship. The limitations of the usual $\chi^2$ test are also considered. 

Let $N_s$ be the observed number of affected individuals (possessing 
characteristic C) who are members of sibships of size s ($s = 1, 2, \ldots, k$), 
and of these let $N_{rs}$ have birth rank r ($r = 1, 2, \ldots, s$). The quantity 
$X = \sum_s \sum_r r N_{rs}$ may be called the total birth-rank, or the score, and used 
to estimate the departure of a sample from randomness, provided the 
distribution of X is known. 

It is assumed that the $N_s$ are fixed numbers, determined by the 
particular sample investigated. In each sibship, every possible order 
of normal and affected sibs is on the null hypothesis equally likely. 
The cumulants of the exact distribution, in these circumstances, of the 
score for a single sibship have been calculated by Haldane and Smith (2). 
They showed that if a is the number of affected sibs in a sibship of size s, 
and if R is the total birth-rank for these affected children, the expectation 
and variance of R are given by 


$$E(R) = \tfrac{1}{2}a(s + 1), \qquad (1)$$


$$V(R) = a(s + 1)(s - a)/12, \qquad (2)$$

on the assumptions that there are no cases of twins among the affected 
and that there are no miscarriages, stillbirths, or children about which 
information is lacking. 

If the information is imperfect, Haldane and Smith suggest counting 
only the s sibs which are definitely known to be affected or not and which 
have definite birth ranks. If a of these are affected, and have total 
birth-rank R, 


$$E(R) = aS_1/s, \qquad (3)$$

$$V(R) = a(s - a)(sS_2 - S_1^2)/[s^2(s - 1)], \qquad (4)$$


where $S_r$ is the sum of the r-th powers of the birth-ranks for all the 
s sibs. Thus for a sibship — N N — A, where N stands for a normal 
child, A for an affected one, and — for a miscarriage or stillbirth, we 
have s = 3, a = 1, $S_1 = 2 + 3 + 5 = 10$, $S_2 = 4 + 9 + 25 = 38$, 
so that E(R) = 10/3 and V(R) = 14/9. 
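
The worked example can be reproduced with exact fractions. The sketch below is an illustration of equations (3) and (4) only; the function name `birth_rank_moments` and the use of an ASCII hyphen for a miscarriage or stillbirth are choices made here, not part of the original notation.

```python
from fractions import Fraction

def birth_rank_moments(sibship):
    """E(R) and V(R) for one sibship, following equations (3) and (4).

    `sibship` lists the births in order: 'N' normal, 'A' affected,
    '-' miscarriage, stillbirth or child of unknown status.
    """
    # Birth-ranks of the sibs whose status is definitely known.
    ranks = [i for i, child in enumerate(sibship, start=1) if child in "NA"]
    s = len(ranks)
    a = sibship.count("A")
    s1 = sum(ranks)
    s2 = sum(r * r for r in ranks)
    expectation = Fraction(a * s1, s)
    if s < 2:
        return expectation, Fraction(0)
    variance = Fraction(a * (s - a) * (s * s2 - s1 * s1), s * s * (s - 1))
    return expectation, variance

e, v = birth_rank_moments("-NN-A")
print(e, v)  # 10/3 14/9
```

For a complete sibship such as `"NNA"` the same function returns 2 and 2/3, agreeing with equations (1) and (2) for a = 1 and s = 3.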

The practical procedure advocated by Haldane and Smith is to add 
the expectations and variances for all the separate sibships (which are 
of course independent) and to add also the total birth-ranks for the 
affected children. The distribution of $\sum R$ for a fairly large number of 
sibships may be expected to be approximately normal. Fractions are 
avoided by computing 6R instead of R. 

For the original data on celiac children in 100 sibships (kindly supplied 
to me by Dr. Margaret Thompson), I find $\sum (6R) = 1464$, with 
an expectation of 1262 and a variance of 2096. The standard deviate 
is therefore 4.41, which is obviously highly significant. 

The conclusion is that the celiac condition tends to occur with later 
births in multiple sibships rather than to be distributed at random over 
the possible birth-orders. It may be suggested that this is simply due 
to a deliberate limitation of family size, once an affected child has been 
born into the family. The celiac condition, which is manifested mainly 
by difficulty in digesting fats, is not, however, sufficiently alarming to 
make such deliberate limitation at all likely, and in any case the ab- 
normality is not normally recognized until the child is about nine months 
old or more. Another possible explanation would be that the data were 
obtained from a clinic, where the young children presented for treatment 
would necessarily be at the time the last in the family. This is also ruled 
out, since all the families concerned have been followed up for four or 
five years. 

If the incidence of the abnormality C is sufficiently rare that few 
cases of more than one in a family occur, the calculation may be simplified. 
In this case, a = 1 in (1) and (2), and we have 


$$E(R) = (s + 1)/2, \qquad (5)$$


$$V(R) = (s^2 - 1)/12. \qquad (6)$$


The distribution of $N_{rs}$ is then multinomial, with expectation $N_s/s$. 
Moreover we shall not be seriously in error if we reckon all the miscarriages 
etc. as normal (i.e. not affected). Then if $X_s$ is the total birth-rank 
for affected sibs in sibships of size s, $X = \sum_s X_s$, and 


$$E(X_s) = N_s(s + 1)/2, \qquad (7)$$


$$V(X_s) = N_s(s^2 - 1)/12, \qquad (8)$$


since all the $N_s$ individuals belonging to sibships of size s have the same 
theoretical distribution of birth-rank. The cumulants for this distribu- 
tion have been explicitly given by Stuart (3), although they form a 
particular case of those given by Haldane and Smith (2). 

In the data supplied on celiac children there were 9 sibships out of 
100 with 2 affected sibs and 3 sibships with 3. There were 3 cases of 
twins. These may be dealt with by allocating an affected twin half to 
the birth-rank of the first member of the pair and half to that of the 
second. Thus, in the sibship NN(A = N), the affected twin has a 
birth-rank of either 3 or 4, and contributes ½ to the frequency of rank 3 
and ½ to the frequency of rank 4. With this procedure the data may be 
summarized as in Table I, which is the same, except for a slight correc- 
tion, as Table 3 of Thompson (Ref. 1). The observed total birth-rank 
from this table is X = 253.5. Since the expectation is 214.5 and the 
variance 83.1, the standardized variate is 4.28, which confirms the high 
significance given by the Haldane and Smith test. 
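
The figures just quoted can be recomputed from the marginal rows of Table I. The sketch below is illustrative only: the lists `N` and `X` are transcribed from the table rows for the number of affected children and their total birth-rank at each sibship size, and the variable names are chosen here.

```python
import math

# Margins of Table I for sibship sizes s = 1, ..., 8.
N = [22, 38, 27, 18, 2, 5, 1, 2]        # affected children, N_s
X = [22, 58.5, 66, 51, 9, 27, 6, 14]    # total birth-rank, X_s

# Expectation and variance of the total score on the null hypothesis.
expected = sum(n * (s + 1) / 2 for s, n in enumerate(N, start=1))
variance = sum(n * (s * s - 1) / 12 for s, n in enumerate(N, start=1))
observed = sum(X)

z = (observed - expected) / math.sqrt(variance)
print(observed, expected, round(variance, 1), round(z, 2))
```

Sibships of size 1 contribute nothing to the variance, so they do not affect the standardized variate.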

This form of the test is much quicker to apply, but rests on the 
assumption that cases of two or more affected sibs in the same sibship 
are relatively rare, so that an affected sib may be assumed on the null 
hypothesis to have an equal probability of any birth-rank r from 1 to s. 

In Dr. Thompson’s paper (1) the significance was judged by a chi-square 
test, similar to one suggested by Penrose (4). For each size of 
sibship, from s = 2 on, the number of affected sibs belonging to “later” 
birth-ranks (r > s/2) was calculated. These numbers are given by 
totalling the frequencies below the stepped line running across Table I, 
and are denoted by $N'_s$. The expected values on the assumption of 
equal probability are denoted by $E(N'_s)$. A $\chi^2$ test with 1 degree of 
freedom, based on the observed and expected totals of “later” and 
“non-later” birth-ranks, gives $\chi^2 = 10.6$, corresponding to a probability 
P = 0.001. 

If, as Penrose suggests, the observed and expected frequencies in each 
birth-rank are compared (see the last two columns of Table I), a $\chi^2$ 
test with 4 d.f. (the ranks from 5 on being grouped) gives $\chi^2 = 11.5$, 
P = 0.022. An alternative form of the $\chi^2$ test is to compare, in each 
size of sibship, the observed frequencies $N_{rs}$ with the expected values 
$N_s/s$, this distribution being approximately $\chi^2$ with s − 1 d.f. if $N_s$ is 
sufficiently large. Adding for the different columns, we get a total $\chi^2$ 
of 42.3 with 28 d.f., corresponding to P = 0.04. The numbers, however, 
are rather small in several of the columns, and if we omit the columns 
s = 5, 7, 8, we have $\chi^2 = 19.1$ for 11 d.f., with P = 0.059. 
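
The first two of these tests can be recomputed from the totals of Table I. This is a sketch under stated assumptions: the observed and expected totals are transcribed from the table, the helper names are hypothetical, and the tail probabilities use closed forms that hold only for 1 and 4 degrees of freedom.

```python
import math

def chi_square(observed, expected):
    """Pearson chi-square statistic for matched observed and expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def upper_tail_1df(x):
    """P(chi-square with 1 d.f. exceeds x)."""
    return math.erfc(math.sqrt(x / 2))

def upper_tail_4df(x):
    """P(chi-square with 4 d.f. exceeds x)."""
    return math.exp(-x / 2) * (1 + x / 2)

total = 115
# "Later" against "non-later" birth-ranks.
chi_later = chi_square([58.5, total - 58.5], [41.73, total - 41.73])
# Frequencies in each birth-rank, with ranks 5 to 8 grouped.
chi_ranks = chi_square([46.5, 30.5, 22, 8, 8],
                       [56.13, 34.13, 15.13, 6.13, 3.50])

print(round(chi_later, 1), round(upper_tail_1df(chi_later), 4))
print(round(chi_ranks, 1), round(upper_tail_4df(chi_ranks), 3))
```

Small differences in the last digit from the quoted probabilities may arise from rounding of the tabulated expectations.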

It appears therefore that a $\chi^2$ test made in this way seriously underestimates 
the significance of the effect. The $\chi^2$ test is well-known to be 
relatively insensitive, since it lumps together all possible types of variation 
from expectation and not merely those due to a positive regression 
of frequency on birth rank. F. Yates (5) has indicated a technique of 
testing the significance of such regression, and this technique, in a 
slightly modified form, is applicable to the present problem. 

The difference $y_s$ between the observed relative score ($X_s/N_s$) and 
the expected value (s + 1)/2 may be regarded as randomly distributed 
about zero, for each value of s, on the null hypothesis that the probability 
of characteristic C is independent of birth-rank. If the probability 
of C increases with birth-rank, $y_s$ will have a positive regression on s. 
If, then, we calculate the observed regression of $y_s$ on s, weighted inversely 
as the variance, the coefficient of regression b may be tested for a 
significant departure from zero. The weight $w_s$ is then $12N_s/(s^2 - 1)$. 

If we write t = s − 1, the first column of Table I, corresponding to 
t = 0, gives $y_s = 0$, identically. The regression is therefore of the type 
y = bt, where 


$$b = \sum wty \Big/ \sum wt^2 \qquad (9)$$


and 

$$\mathrm{Var}(b) = 1 \Big/ \sum wt^2, \qquad (10)$$


the sums being over values of t from 1 to 7. From the data, $\sum wty = 88.36$ 
and $\sum wt^2 = 530.1$, so that 

b = 0.167, Var(b) = 0.00189. 


The value of b divided by its standard error is 3.85, showing once 
more a highly significant dependence of C on birth-rank. 
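
The weighted regression through the origin can be recomputed from the same marginal rows of Table I. The sketch assumes the weights are the reciprocals of the variances of the relative scores, as described above; the variable names are chosen here for illustration.

```python
import math

N = [22, 38, 27, 18, 2, 5, 1, 2]        # N_s for s = 1, ..., 8 (Table I)
X = [22, 58.5, 66, 51, 9, 27, 6, 14]    # X_s for s = 1, ..., 8 (Table I)

sum_wty = 0.0
sum_wtt = 0.0
for s, (n, x) in enumerate(zip(N, X), start=1):
    if s == 1:
        continue                        # t = 0: no information on the slope
    t = s - 1
    y = x / n - (s + 1) / 2             # observed minus expected relative score
    w = 12 * n / (s * s - 1)            # reciprocal of the variance of y
    sum_wty += w * t * y
    sum_wtt += w * t * t

b = sum_wty / sum_wtt                   # regression coefficient
var_b = 1 / sum_wtt                     # its variance on the null hypothesis
ratio = b / math.sqrt(var_b)
print(round(sum_wty, 2), round(sum_wtt, 1), round(b, 3), round(var_b, 5))
print(round(ratio, 2))
```

Carried to full precision the ratio comes out very close to the quoted 3.85; the small difference is presumably due to rounding of the intermediate values.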


TABLE I 
BIRTH-RANK OF CELIAC CHILDREN 

                                   Sibship size (s) 
Birth-rank (r)       1      2      3      4      5      6      7      8     Totals   Expectation 
      1             22    17.5     4      3      0      0      0      0      46.5      56.13 
      2                   20.5     7      3      0      0      0      0      30.5      34.13 
      3                           16      6      0      0      0      0      22        15.13 
      4                                   6      1      1      0      0       8         6.13 
      5                                          1      1      0      0       2         1.63 
      6                                                 3      1      0       4         1.23 
      7                                                        0      2       2         0.39 
      8                                                               0       0         0.25 
  5 to 8 (grouped)                                                            8         3.50 

  N_s               22     38     27     18      2      5      1      2     115       115.03 
  X_s               22    58.5    66     51      9     27      6     14     253.5 
  E(X_s)            22     57     54     45      6     17.5    4      9     214.5 
  N'_s               0    20.5    16     12      2      5      1      2      58.5 
  E(N'_s)            0     19      9      9     0.80    2.5   0.43    1      41.73 
  y_s                0   .0395  .4444  .3333    1.5    1.9    2.0    2.5 
  w_s                —    152    40.5   14.4    1.0   1.714   0.25  0.381 
  t                  0      1      2      3      4      5      6      7 


The underlying assumption, both in the test of the total score X and 
in the test of significance for regression, is that $X_s$ is normally distributed. 
Haldane and Smith (2) showed that in a large single sibship the sum of 
birth-ranks is nearly normally distributed when the number of affected 
sibs is three or more. If, as in the data here considered, a is seldom 
greater than one the distribution in a single sibship is rectangular, but 
if all sibships of the same size are added together, the distribution of 
$X_s$ rapidly approaches normality. It is shown in the Appendix that 
even for quite small values of $N_s$ (e.g. 5) the approximation is good, and 
is not very far out even for $N_s = 2$. 
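
The quality of the normal approximation for small numbers can also be examined by direct enumeration. The sketch below builds the exact distribution of the score for a given sibship size by repeated convolution and compares it with the normal integral. Counting only half the probability at the point itself (a mid-point convention) is an assumption made here; with it, the exact values for s = 4 and two affected children agree with the first three rows of the P column of Table II(c).

```python
import math

def score_pmf(s, n):
    """Exact distribution of the sum of n independent ranks uniform on 1..s."""
    pmf = {0: 1.0}
    for _ in range(n):
        nxt = {}
        for total, p in pmf.items():
            for r in range(1, s + 1):
                nxt[total + r] = nxt.get(total + r, 0.0) + p / s
        pmf = nxt
    return pmf

def normal_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def compare(s, n, deviation):
    """Exact mid-point probability and normal value at mean + deviation."""
    pmf = score_pmf(s, n)
    mean = n * (s + 1) / 2
    sigma = math.sqrt(n * (s * s - 1) / 12)
    point = mean + deviation
    below = sum(p for x, p in pmf.items() if x < point)
    at = sum(p for x, p in pmf.items() if x == point)
    return below + at / 2, normal_cdf(deviation / sigma)

for d in (0, 1, 2):
    exact, approx = compare(4, 2, d)
    print(d, round(exact, 3), round(approx, 3))
```

Increasing `n` in the call to `compare` shows how quickly the two columns converge as more sibships of the same size are pooled.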

The conclusion is that the $\chi^2$ test as commonly applied is not a
satisfactory test of significance. There is an exact test, developed by
Haldane and Smith, and this indicates a very highly significant effect
of birth-order in the data discussed (celiac children). When few cases
of more than one affected child occur in a single sibship, the total
birth-rank $X = \sum X_s$ may be compared with expectation, and the significance
tested, by means of equations (7) and (8) and the assumption of a
normal distribution. An alternative form of test, which gives similar
results, is to calculate a simple weighted regression $y = bt$, where
$y = X_s/N_s - (s + 1)/2$ and $t = s - 1$, the weights being equal to
$12N_s/(s^2 - 1)$. On the null hypothesis $b$ would be zero, and the
significance of $b$ may be tested by comparing it with its standard error,
as given by equations (9) and (10). Either of these tests is fairly simple
to apply.
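
The weighted regression just described can be checked numerically. The following is only a sketch: it takes the $N_s$ and $X_s$ rows of Table I as input, and the function and variable names are illustrative rather than taken from the paper. Sibships of size 1 are skipped because $t = 0$ there and the weight is undefined.

```python
# Sibship sizes s = 1..8 with N_s and X_s as read from Table I.
sizes = [1, 2, 3, 4, 5, 6, 7, 8]
N = [22, 38, 27, 18, 2, 5, 1, 2]
X = [22, 58.5, 66, 51, 9, 27, 6, 14]


def weighted_regression(sizes, N, X):
    """Regression through the origin, y = b*t, with weights 12*N_s/(s^2 - 1)."""
    sum_wty = 0.0
    sum_wt2 = 0.0
    for s, n, x in zip(sizes, N, X):
        if s == 1:
            continue  # t = 0 here and the weight 12*N_s/(s^2 - 1) is undefined
        t = s - 1
        y = x / n - (s + 1) / 2      # mean birth-rank minus its expectation
        w = 12 * n / (s * s - 1)     # reciprocal of the null variance of y
        sum_wty += w * t * y
        sum_wt2 += w * t * t
    b = sum_wty / sum_wt2            # equation (9)
    var_b = 1 / sum_wt2              # equation (10)
    return sum_wty, sum_wt2, b, var_b


sum_wty, sum_wt2, b, var_b = weighted_regression(sizes, N, X)
ratio = b / var_b ** 0.5             # b divided by its standard error
print(round(sum_wty, 2), round(sum_wt2, 1), round(b, 3), round(var_b, 5))
```

The two sums agree with the values 88.36 and 530.1 quoted in the text to within rounding of the tabulated weights.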


APPENDIX 


The characteristic function for the score $X_s\,(= \sum_r rN_{rs})$ is

$$C(u) = \left[\frac{\sin(su/2)}{s\,\sin(u/2)}\right]^{N_s},$$

the origin being at $(s + 1)N_s/2$.
The distribution function for X, is therefore 


$$F(x) = \Pr[X_s - (s + 1)N_s/2 \le x]$$


If

$$y = [X_s - (s + 1)N_s/2]/\sigma,$$

i.e. the standardized variable corresponding to $X_s$, the distribution
function for $y$ is


Ns 


sin 
F(a) = 5+ [ (11) 


where $\sigma^2 = N_s(s^2 - 1)/12$. The actual distribution of $X_s$ is
discontinuous, but equation (11) gives the distribution of a continuous variable
with the same cumulants.

An attempt to estimate the deviation from the normal law for moderate
values of $N_s$ may be made by using an Edgeworth series (6). This
series in fact arises when the integral in (11) is evaluated in powers of
$1/N_s$. The result, as far as the term in $1/N_s^2$, is


TABLE II

(a) Values of $F(x)$ for different $s$ and for $N_s = 5$.

    s          2        4        6        8
    F(1.9)   .9726    .9722    .9722    .9722
    F(2.0)   .9793    .9786    .9785    .9785

(b) Values of $F(x)$ for $s = 4$ and for different $N_s$.

    N_s        2        5       10       15
    F(1.9)   .974     .9722    .9717    .9716
    F(2.0)   .981     .9786    .9779    .9777

    $\Phi(1.9) = 0.9713$, $\Phi(2.0) = 0.9772$.

(c) Values for $s = 4$, $N_s = 2$.

      x         P       F(x)    $\Phi(x)$
    0         0.500    0.500    0.500
    0.632     0.719    0.720    0.737
    1.265     0.875    0.886    0.897
    1.897     0.938    0.974    0.971


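$$F(x) = \Phi(x) - \frac{1}{20N_s}\,\frac{s^2+1}{s^2-1}\,\phi^{(3)}(x) + \frac{1}{N_s^2}\left[\frac{1}{105}\,\frac{s^4+s^2+1}{(s^2-1)^2}\,\phi^{(5)}(x) + \frac{1}{800}\left(\frac{s^2+1}{s^2-1}\right)^2\phi^{(7)}(x)\right] \tag{12}$$

where $\phi(x) = (2\pi)^{-1/2}e^{-x^2/2}$, $\Phi(x)$ is its integral (the normal function)
and $\phi^{(r)}(x)$ is its $r$-th derivative. The derivatives
of $\phi(x)$ are given in Glover's Tables (7). They can, of course, be
expressed in terms of $\phi(x)$ by means of Hermite polynomials. Thus,

$$\begin{aligned}
\phi^{(3)}(x) &= -(x^3 - 3x)\,\phi(x),\\
\phi^{(5)}(x) &= -(x^5 - 10x^3 + 15x)\,\phi(x),\\
\phi^{(7)}(x) &= -(x^7 - 21x^5 + 105x^3 - 105x)\,\phi(x).
\end{aligned} \tag{13}$$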

If we calculate from equation (12) the values of $F(x)$ in the neighborhood
of $x = 2$ for $s = 2$ to $s = 8$, and for assumed values of $N_s$, it
becomes evident that the 5% significance level, corresponding to $F(x) = 0.975$,
will not be seriously affected by the assumption of normality.
The values of $F(x)$ are almost independent of $s$ in this part of the
distribution (see Table II), and for $N_s$ as large as 10 there is little
difference between $F(x)$ and the normal function $\Phi(x)$.

For $N_s = 2$, of course, the series (12) converges slowly and one
can hardly expect a very good approximation. In this case, however,
the integration in (11) can be performed directly, and the exact
probabilities $P$ obtained. Values for $s = 4$ are given in part (c) of Table II.
It is evident that even here, throughout most of the range, the
approximation $F(x)$ is quite good, although it is somewhat worse than the normal
out near the tail.
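
A short sketch of how the Edgeworth approximation can be evaluated is given below. It assumes the series takes the usual form for a sum of $N_s$ independent ranks, each uniform on $1, \ldots, s$ (second, fourth and sixth cumulants $(s^2-1)/12$, $-(s^4-1)/120$ and $(s^6-1)/252$), with terms kept as far as $1/N_s^2$. The function names are illustrative and not from the paper.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


def edgeworth_cdf(x, s, n):
    """Edgeworth approximation to F(x) for sibship size s and N_s = n."""
    ratio = (s * s + 1) / (s * s - 1)
    # Derivatives of the normal density via Hermite polynomials.
    d3 = -(x**3 - 3 * x) * phi(x)
    d5 = -(x**5 - 10 * x**3 + 15 * x) * phi(x)
    d7 = -(x**7 - 21 * x**5 + 105 * x**3 - 105 * x) * phi(x)
    term_n1 = -ratio * d3 / (20 * n)
    term_n2 = ((s**4 + s * s + 1) / (105 * (s * s - 1) ** 2) * d5
               + ratio**2 * d7 / 800) / n**2
    return Phi(x) + term_n1 + term_n2


# Compare with Table II: part (a) uses N_s = 5, part (b) uses s = 4.
for s in (2, 4, 6, 8):
    print(s, round(edgeworth_cdf(1.9, s, 5), 4), round(edgeworth_cdf(2.0, s, 5), 4))
```

Under these assumptions the sketch agrees with the tabulated entries of Table II to about four decimal places, which is some reassurance that the form of the series used here matches the one intended.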


REFERENCES 


(1) Margaret W. Thompson, Heredity, Maternal Age and Birth Order in the Etiology 
of Celiac Disease, American Journal Human Genetics, 3, 159-166, 1951. 

(2) J. B. S. Haldane and C. A. B. Smith, A Simple Exact Test for Birth Order Effect,
Ann. Eugenics (London), 14, 117-124, 1947-49.

(3) A. Stuart, The Cumulants of the First n Natural Numbers, Biometrika, 37, 446,
1950.

(4) L. S. Penrose, The Influence of Heredity on Disease, London (Lewis). iv + 78 
(see p. 59). 

(5) F. Yates, The Analysis of Contingency Tables with Groupings based on Quanti- 
tative Characters, Biometrika, 35, 176-181, 1948. 

(6) See, e.g., M. G. Kendall, Advanced Theory of Statistics, 148, London (Griffin), 
1947. 

(7) J. W. Glover, Tables of Applied Mathematics in Finance, Insurance, Statistics,
392-411, Ann Arbor (Wahr) 1923.


A COMPARISON OF LITCHFIELD-WILCOXON 
AND BLISS ESTIMATES 


FRIEDA FARMAN EISENBERG*


This study was conducted in order to compare the graphic estimates
(1) with the computed estimates (2, 3) in evaluating 52 dose-effect
experiments from our laboratories. The $ED_{50}$ and the 19/20 probability
interval for the $ED_{50}$** were the two statistics of interest.

The data analyzed represent a wide variety of experimental plan and 
purpose. The experiments had from 2-32 animals or insects at each 
dose and a range of 4-10 doses per experiment. 

The following information was obtained for each experiment previously
analyzed by the Bliss calculations: the dose-effect data, the line
visually put through the data when converted to logarithms and probits
(the expected probit line of Bliss), and the computed $LD_{50}$ and its 19/20
probability limits in original units. The expected probit line selected
independently for the Bliss analysis became the expected per cent effect
line for the Litchfield-Wilcoxon analysis worked out on the same data.

Figure 1 summarizes the results. The log-log scale was arbitrarily 
chosen as the best way in which to show the amount of agreement be- 
tween the two sets of statistics for this body of data. 


*Formerly biostatistician for the Medical Division at the Army Chemical Center, Maryland. 
**The 19/20 probability interval is the difference between the upper and lower 19/20 probability
limits for the $ED_{50}$ and is approximately equal to four times the average value for the standard error
of the $ED_{50}$.


[Figure 1. Comparison of Litchfield-Wilcoxon and Bliss estimates (log-log scale).
Line A: $ED_{50}$ comparison. Line B: 19/20 probability interval comparison.]


It is apparent from Line A that the $ED_{50}$'s are in very good agreement.
However, the 19/20 probability interval values (Line B) are not
in close agreement. But 36 out of 52 Litchfield-Wilcoxon values are 
larger than the corresponding Bliss. This suggests that the graphic 
estimate will not underestimate the size of the confidence interval. 


REFERENCES 


(1) Litchfield, J. T. and Wilcoxon, F. A Simplified Method of Evaluating Dose- 
Effect Experiments. The Journal of Pharmacology and Experimental Therapeu- 
tics, 96, 2, 99-113, 1949. 

(2) Bliss, C. I. The Calculation of the Dosage-Mortality Curve. Annals of Applied 
Biology, 22, 134-67, 1935. 

(3) Bliss, C. I. The Determination of the Dosage-Mortality Curve from Small
Numbers. Quarterly Journal of Pharmacy and Pharmacology, XI, 192-216, 1938.

(4) Litchfield, J. T. and Fertig, J. W. On a Graphic Solution of the Dosage-Effect
Curve. Bulletin of the Johns Hopkins Hospital, LXIX, 3, 276-286, 1941.



ANALYSIS OF PARTIALLY BALANCED INCOMPLETE BLOCK 
DESIGNS ILLUSTRATED ON THE SIMPLE SQUARE 
AND RECTANGULAR LATTICES* 


K. R. NAIR


Forest Research Institute, Dehra Dun, India 
and 
Institute of Statistics, University of North Carolina 


1. INTRODUCTION


The statistical analysis appropriate for partially balanced incomplete
block (p. b. i. b.) designs was discussed by Bose and Nair (1939) 
taking into account only intra-block information. Methods for using 
inter-block, in addition to intra-block, information have been given by 
Nair (1944) and Rao (1947). Although the methods of statistical 
analysis of p.b.i.b. designs discussed by these authors apply to the gen- 
eral case of m associate classes, detailed formulae for the variances of 
estimates of treatment differences and for the efficiency factor of the 
designs were worked out only for the cases m = 2 and 3, as it was 
thought at the time that for practically useful designs m should not 
exceed 3. Accordingly, the many illustrative examples of p.b.i.b. de- 
signs given by them include only cases for which m is either 2 or 3. It 
was shown by Bose and Nair (1939) that the simple and triple square 
lattices were special cases of p.b.i.b. designs for which m = 2. 

The combinatorial conditions satisfied by p.b.i.b. designs as originally
developed by Bose and Nair (1939) required that the $m$ associate classes
with respect to each treatment could be distinguished by the $\lambda$-criterion
as $\lambda_1, \lambda_2, \ldots, \lambda_m$ were assumed to be unequal. Nair and Rao (1942)
found, however, that this assumption of unequal $\lambda$'s was not necessary.
By this amendment of the definition of p.b.i.b. designs they were able
to show that the cubic lattice was a special case of p.b.i.b. design for
which $m = 3$.


*Presented before a joint session of The Biometric Society and the American Statistical Association 
in Boston, Massachusetts on Thursday, December 27, 1951. 




Recently, it has been shown by the present author (1951a) that the 
simple rectangular lattice for p(p — 1) treatments developed by Harsh- 
barger (1947) is an example of a p.b.i.b. design for which m = 4. Since 
only two replications are basically required for this design it is of much 
practical value. The present author (1951b) has constructed a rather 
interesting example of a two-replicate design for 12 treatments in blocks 
of 4 plots which also happens to be a p.b.i.b. design for which m = 4. 

The analyses of square and cubic lattices have been done by Yates 
(1936) by the device of considering them as confounded symmetrical 
factorial designs. In the case of the rectangular lattice this device is not 
available. Harshbarger (1947) and Grundy (1950) have discussed the 
methods of analysis of the simple rectangular lattice. One of the pur- 
poses of this paper is to show that the results obtained by Grundy can 
be directly derived by substitution in the general formulae for analysis 
of p.b.i.b. designs obtained by Nair (1944). For this purpose explicit 
expressions for the case m = 4 will be given here for the first time, and, 
for the sake of completeness the expressions for the cases m = 2 and 3 
will be quoted from earlier papers of Bose, Nair and Rao. 

The original method of analysis of the simple square lattice given by 
Yates has been considerably simplified by Cochran and Cox (1950).
Their method is widely in use. It may therefore help in the under- 
standing of the method of analysis of a p.b.i.b. design if the analysis is 
illustrated on the familiar simple square lattice. This has been done 
in Sections 5 and 6 of the paper. 


2. COMBINATORIAL CONDITIONS OF P.B.I.B. DESIGNS
An incomplete block design is said to be partially balanced if


(i) There are $t$ treatments, arranged in $b$ blocks each containing $k$
experimental units with different treatments assigned to each.

(ii) Each treatment occurs in $r$ blocks.

(iii) With respect to any treatment the remaining can be divided into $m$
groups containing $n_1, n_2, \ldots, n_m$ treatments, such that the $n_e$
treatments of the $e$-th group occur with the given treatment $\lambda_e$
times. The treatments of the $e$-th group are said to be $e$-th
associates of the given treatment. The numbers $n_1, n_2, \ldots, n_m$;
$\lambda_1, \lambda_2, \ldots, \lambda_m$ are independent of the treatment with which we
start. Some of the $\lambda$'s may be equal. (If all $\lambda$'s become equal the
design becomes completely balanced).

(iv) If the treatment A is an $e$-th associate of B, then the treatment B is
also an $e$-th associate of A. If A and B are $e$-th associates then the
number of treatments common to the $f$-th associates of A and $g$-th
associates of B is $p^e_{fg}$ and is independent of the pair of treatments
we start with. The following relations exist:

$$tr = bk \tag{1}$$

$$\sum_{e=1}^{m} n_e\lambda_e = r(k - 1) \tag{2}$$

$$\sum_{g=1}^{m} p^e_{fg} = n_f - 1 \ \text{or}\ n_f \ \text{according as}\ e = f \ \text{or}\ e \ne f \tag{3}$$

$$p^e_{fg} = p^e_{gf} \tag{4}$$

$$n_e\,p^e_{fg} = n_f\,p^f_{eg} \tag{5}$$


3. INTRA-BLOCK ANALYSIS OF P.B.I.B. DESIGNS 


Let $x_{uv}$ be the observed value of the character under study for the
experimental unit allotted to the $u$-th treatment in the $v$-th block. Let
us postulate that $\mu$ is the hypothetical mean, $\tau_u$ the effect of the $u$-th
treatment and $\beta_v$ the effect of the $v$-th block and that these effects are
additive. We then assume

$$x_{uv} = \mu + \tau_u + \beta_v + \epsilon_{uv} \tag{6}$$

where $\epsilon_{uv}$ is considered to be a random variable following Normal Law
with zero mean and unknown standard deviation $\sigma_k$.

Let $T_u$ stand for the total value of $x_{uv}$ for the $u$-th treatment; $B_v$
stand for the total value of $x_{uv}$ for the $v$-th block and $G$ for the grand
total of $x_{uv}$. We then define

$$Q_u = T_u - \text{Sum of the } r \text{ block means of } u\text{-th treatment} \tag{7}$$

i.e. $Q_u$ is the sum of the $r$ observed values of $x$ for the $u$-th treatment each
corrected by its block mean.

Let $m$, $t_u$, $b_v$ be the estimates of $\mu$, $\tau_u$, $\beta_v$ obtained by minimising

$$\sum_{u,v} (x_{uv} - m - t_u - b_v)^2 \tag{8}$$

subject to the conditions $\sum t_u = 0$ and $\sum b_v = 0$. It can be shown
(see Bose and Nair, 1939) that the estimates $t_u$ of the treatment effects
satisfy the $t$ equations

$$kQ_u = r(k - 1)t_u - \lambda_1\sum t_{u1} - \lambda_2\sum t_{u2} - \cdots - \lambda_m\sum t_{um} \qquad (u = 1, 2, \ldots, t) \tag{9}$$

where $\sum t_{u1}$ stands for the total estimated effects of the $n_1$ treatments
which form the 1st associates of the $u$-th treatment, $\sum t_{u2}$ corresponding
quantities for the 2nd associates, etc.




Since $\sum_1^t Q_u = 0$, it will be seen that the l.h.s. and r.h.s. of the $t$
equations when added up lead to the identity 0 = 0, showing that they
are not independent. Making use of the arbitrary conditions $\sum_1^t t_u = 0$
which may be written in the form

$$t_u + \sum t_{u1} + \sum t_{u2} + \cdots + \sum t_{um} = 0 \tag{10}$$

to eliminate $\sum t_{um}$ from (9), we have

$$kQ_u = \{r(k - 1) + \lambda_m\}t_u + (\lambda_m - \lambda_1)\sum t_{u1} + \cdots + (\lambda_m - \lambda_{m-1})\sum t_{u,m-1} \tag{11}$$

By adding up the $n_1$ equations corresponding to $\sum t_{u1}$, the $n_2$ equations
corresponding to $\sum t_{u2}$, ..., and the $n_{m-1}$ equations corresponding to
$\sum t_{u,m-1}$, so that their l.h.s. become $k\sum Q_{u1}, k\sum Q_{u2}, \ldots, k\sum Q_{u,m-1}$,
we get $m$ equations involving the unknowns $t_u, \sum t_{u1}, \ldots,
\sum t_{u,m-1}$ from which the solution of $t_u$ can be obtained as ratio of two
determinants of the $m$-th order.

As shown by Bose and Nair (1939), the sum of squares due to
treatments (eliminating block effects) is given by the elegant expression

$$\sum_{u=1}^{t} t_uQ_u. \tag{12}$$

The sum of squares due to blocks (eliminating treatment effects) can
be shown to be an expression of the form $\sum_1^b b_vQ_v^*$ (see Nair, 1941 and
1944) but, to avoid the calculation of $b_v$ and $Q_v^*$, it can be expressed as

$$\sum_1^t t_uQ_u + \frac{1}{k}\sum_1^b B_v^2 - \frac{1}{r}\sum_1^t T_u^2. \tag{13}$$

The residual or intra-block error sum of squares is

$$\sum_{u,v} x_{uv}^2 - \sum_1^t t_uQ_u - \frac{1}{k}\sum_1^b B_v^2. \tag{14}$$

We can therefore prepare the following table of analysis of variance
which is an adaptation of Table 1 of Nair's (1944) paper.

TABLE 1. ANALYSIS OF VARIANCE OF P.B.I.B. DESIGNS

Source of variation        Degrees of freedom   Observed sum of squares                                                   Mean square
Blocks                     $b - 1$              $\frac{1}{k}\sum_1^b B_v^2 - \frac{1}{r}\sum_1^t T_u^2 + \sum_1^t t_uQ_u$   $E_b$
 (eliminating treatments)
Treatments                 $t - 1$              $\sum_1^t t_uQ_u$                                                         $E_t$
 (eliminating blocks)
Intra-block error          $tr - b - t + 1$     $\sum_{u,v} x_{uv}^2 - \sum_1^t t_uQ_u - \frac{1}{k}\sum_1^b B_v^2$       $E_e$

The treatment effects can be tested for significance using the variance
ratio $F = E_t/E_e$.

To complete the analysis, we must obtain expressions for variances of
treatment differences. There are $m$ types of such differences depending
on whether the pairs of treatments compared are 1st, 2nd, ..., $m$-th
associates.

If $V_1, V_2, \ldots, V_m$ are these variances, the mean variance of the
$\frac{1}{2}t(t - 1)$ comparisons between all pairs of treatments is

$$\bar V = \frac{1}{t - 1}\sum_{e=1}^{m} n_eV_e \tag{15}$$

since there are $\frac{1}{2}tn_1$ comparisons of the first type, $\frac{1}{2}tn_2$ comparisons of the
second, etc.

If the experiment was conducted in $r$ randomized complete blocks,
the mean variance would be $2\sigma_t^2/r$ where $\sigma_t$ is the standard deviation
within blocks of $t$ plots. The efficiency of the design using only
intra-block information is given by the expression $2\sigma_t^2/r\bar V$. The efficiency factor
of the design is given by $2\sigma_k^2/r\bar V$. In other words, the intra-block
efficiency of the design = $(\sigma_t/\sigma_k)^2 \times$ efficiency factor. The efficiency
factor depends only on the combinatorial parameters of the design and
is less than unity, whereas the values of $\sigma_t$ and $\sigma_k$ have to be empirically
determined by actual experimentation in the field. We naturally expect
that $\sigma_t$ will be greater than $\sigma_k$.

The intra-block efficiency of the p.b.i.b. design will be greater than
that of the complete block design only if the efficiency factor is greater
than $(\sigma_k/\sigma_t)^2$.

For resolvable p.b.i.b. designs, i.e. those for which $t$ is a multiple of
k and, in addition, the blocks of each replication can be separated so that 
the sum of squares for blocks can be split up into the two components 
(i) between replications and (ii) within replications, with (r — 1) and 
(b — r) degrees of freedom respectively, we can write 


1) 


> 




and 


= 


(17) 


where 1/w and 1/w’ are variances per plot for intra- and inter-block 
comparisons. 


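(3.1) Two-associate designs (m = 2).
The estimated effect of the $u$-th treatment is

$$t_u = k\begin{vmatrix} Q_u & B_{12}\\ \sum Q_{u1} & B_{22}\end{vmatrix} \div \begin{vmatrix} A_{12} & B_{12}\\ A_{22} & B_{22}\end{vmatrix} \tag{18}$$

where

$$\begin{aligned}
A_{12} &= r(k - 1) + \lambda_2\\
A_{22} &= (\lambda_2 - \lambda_1)(n_1 - p^2_{11})\\
B_{12} &= (\lambda_2 - \lambda_1)\\
B_{22} &= r(k - 1) + \lambda_2 + (\lambda_2 - \lambda_1)(p^1_{11} - p^2_{11})
\end{aligned} \tag{19}$$

The variance of the difference between estimated effects of two
treatments which are 1st associates is given by

$$V_1 = 2k\sigma_k^2\begin{vmatrix} 1 & B_{12}\\ -1 & B_{22}\end{vmatrix} \div \begin{vmatrix} A_{12} & B_{12}\\ A_{22} & B_{22}\end{vmatrix} = \frac{2k(B_{12} + B_{22})\sigma_k^2}{A_{12}B_{22} - A_{22}B_{12}} \tag{20}$$

If the two treatments are 2nd associates, the variance is

$$V_2 = 2k\sigma_k^2\begin{vmatrix} 1 & B_{12}\\ 0 & B_{22}\end{vmatrix} \div \begin{vmatrix} A_{12} & B_{12}\\ A_{22} & B_{22}\end{vmatrix} = \frac{2kB_{22}\sigma_k^2}{A_{12}B_{22} - A_{22}B_{12}} \tag{21}$$

The mean variance is

$$\bar V = \frac{1}{t - 1}(n_1V_1 + n_2V_2) = \frac{2k\sigma_k^2}{t - 1}\begin{vmatrix} t - 1 & B_{12}\\ -n_1 & B_{22}\end{vmatrix} \div \begin{vmatrix} A_{12} & B_{12}\\ A_{22} & B_{22}\end{vmatrix} \tag{22}$$

The efficiency factor of the design is given by

$$\frac{t - 1}{rk}\begin{vmatrix} A_{12} & B_{12}\\ A_{22} & B_{22}\end{vmatrix} \div \begin{vmatrix} t - 1 & B_{12}\\ -n_1 & B_{22}\end{vmatrix} \tag{23}$$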
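
As a check on the two-associate formulae, the efficiency factor can be evaluated for the simple square lattice, whose parameters are listed in Section 5. The sketch below assumes the coefficients take the form obtained by summing equation (11) over the first associates of a treatment, so that $A_{22} = (\lambda_2 - \lambda_1)(n_1 - p^2_{11})$. The function name and argument order are illustrative only.

```python
from fractions import Fraction


def two_associate_efficiency(t, r, k, lam1, lam2, n1, p1_11, p2_11):
    """Efficiency factor of a two-associate p.b.i.b. design (exact arithmetic)."""
    A12 = r * (k - 1) + lam2
    A22 = (lam2 - lam1) * (n1 - p2_11)
    B12 = lam2 - lam1
    B22 = r * (k - 1) + lam2 + (lam2 - lam1) * (p1_11 - p2_11)
    delta = A12 * B22 - A22 * B12            # determinant |A12 B12; A22 B22|
    numer = (t - 1) * B22 + n1 * B12         # determinant |t-1 B12; -n1 B22|
    return Fraction((t - 1) * delta, r * k * numer)


def lattice_efficiency(p, l=1):
    """Simple square lattice: k = p, r = 2l, t = p^2, lambda_1 = l, lambda_2 = 0."""
    return two_associate_efficiency(
        t=p * p, r=2 * l, k=p, lam1=l, lam2=0,
        n1=2 * (p - 1), p1_11=p - 2, p2_11=2,
    )


for p in (3, 4, 5, 7):
    print(p, lattice_efficiency(p))
```

For every $p$ tried, and for any number of repetitions $l$, the result equals $(p+1)/(p+3)$, the familiar efficiency factor of the simple lattice.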




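(3.2) Three-associate designs (m = 3).
The estimated effect of the $u$-th treatment is

$$t_u = k\begin{vmatrix} Q_u & B_{13} & C_{13}\\ \sum Q_{u1} & B_{23} & C_{23}\\ \sum Q_{u2} & B_{33} & C_{33}\end{vmatrix} \div \begin{vmatrix} A_{13} & B_{13} & C_{13}\\ A_{23} & B_{23} & C_{23}\\ A_{33} & B_{33} & C_{33}\end{vmatrix} \tag{24}$$

where

$$\begin{aligned}
A_{13} &= r(k - 1) + \lambda_3\\
A_{23} &= (\lambda_3 - \lambda_1)(n_1 - p^3_{11}) - (\lambda_3 - \lambda_2)p^3_{12}\\
A_{33} &= (\lambda_3 - \lambda_2)(n_2 - p^3_{22}) - (\lambda_3 - \lambda_1)p^3_{12}\\
B_{13} &= (\lambda_3 - \lambda_1)\\
B_{23} &= r(k - 1) + \lambda_3 + (\lambda_3 - \lambda_1)(p^1_{11} - p^3_{11}) + (\lambda_3 - \lambda_2)(p^1_{12} - p^3_{12})\\
B_{33} &= (\lambda_3 - \lambda_1)(p^1_{12} - p^3_{12}) + (\lambda_3 - \lambda_2)(p^1_{22} - p^3_{22})\\
C_{13} &= (\lambda_3 - \lambda_2)\\
C_{23} &= (\lambda_3 - \lambda_1)(p^2_{11} - p^3_{11}) + (\lambda_3 - \lambda_2)(p^2_{12} - p^3_{12})\\
C_{33} &= r(k - 1) + \lambda_3 + (\lambda_3 - \lambda_1)(p^2_{12} - p^3_{12}) + (\lambda_3 - \lambda_2)(p^2_{22} - p^3_{22})
\end{aligned} \tag{25}$$

Variance $V_1$ of difference between two first associates is given by $2\sigma_k^2$
times the value of the r.h.s. of (24) when the elements of the first column
of the determinant in the numerator are replaced by 1, -1, 0 respectively.

Similarly $V_2$ is given by $2\sigma_k^2$ times the value of the r.h.s. of (24) when
these elements are replaced by 1, 0, -1 respectively.

Lastly $V_3$ is obtained when these elements are replaced by 1, 0, 0
respectively. The mean variance is

$$\bar V = \frac{1}{t - 1}(n_1V_1 + n_2V_2 + n_3V_3) = \frac{2k\sigma_k^2}{t - 1}\begin{vmatrix} t - 1 & B_{13} & C_{13}\\ -n_1 & B_{23} & C_{23}\\ -n_2 & B_{33} & C_{33}\end{vmatrix} \div \begin{vmatrix} A_{13} & B_{13} & C_{13}\\ A_{23} & B_{23} & C_{23}\\ A_{33} & B_{33} & C_{33}\end{vmatrix} \tag{26}$$

and the efficiency factor is

$$\frac{t - 1}{rk}\begin{vmatrix} A_{13} & B_{13} & C_{13}\\ A_{23} & B_{23} & C_{23}\\ A_{33} & B_{33} & C_{33}\end{vmatrix} \div \begin{vmatrix} t - 1 & B_{13} & C_{13}\\ -n_1 & B_{23} & C_{23}\\ -n_2 & B_{33} & C_{33}\end{vmatrix} \tag{27}$$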
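(3.3) Four-associate designs (m = 4).
The estimated effect of the $u$-th treatment is

$$t_u = k\begin{vmatrix} Q_u & B_{14} & C_{14} & D_{14}\\ \sum Q_{u1} & B_{24} & C_{24} & D_{24}\\ \sum Q_{u2} & B_{34} & C_{34} & D_{34}\\ \sum Q_{u3} & B_{44} & C_{44} & D_{44}\end{vmatrix} \div \begin{vmatrix} A_{14} & B_{14} & C_{14} & D_{14}\\ A_{24} & B_{24} & C_{24} & D_{24}\\ A_{34} & B_{34} & C_{34} & D_{34}\\ A_{44} & B_{44} & C_{44} & D_{44}\end{vmatrix} \tag{28}$$

where

$$\begin{aligned}
A_{14} &= r(k - 1) + \lambda_4\\
A_{24} &= (\lambda_4 - \lambda_1)(n_1 - p^4_{11}) - (\lambda_4 - \lambda_2)p^4_{12} - (\lambda_4 - \lambda_3)p^4_{13}\\
A_{34} &= (\lambda_4 - \lambda_2)(n_2 - p^4_{22}) - (\lambda_4 - \lambda_1)p^4_{12} - (\lambda_4 - \lambda_3)p^4_{23}\\
A_{44} &= (\lambda_4 - \lambda_3)(n_3 - p^4_{33}) - (\lambda_4 - \lambda_1)p^4_{13} - (\lambda_4 - \lambda_2)p^4_{23}\\
B_{14} &= (\lambda_4 - \lambda_1)\\
B_{24} &= r(k - 1) + \lambda_4 + (\lambda_4 - \lambda_1)(p^1_{11} - p^4_{11}) + (\lambda_4 - \lambda_2)(p^1_{12} - p^4_{12}) + (\lambda_4 - \lambda_3)(p^1_{13} - p^4_{13})\\
B_{34} &= (\lambda_4 - \lambda_1)(p^1_{12} - p^4_{12}) + (\lambda_4 - \lambda_2)(p^1_{22} - p^4_{22}) + (\lambda_4 - \lambda_3)(p^1_{23} - p^4_{23})\\
B_{44} &= (\lambda_4 - \lambda_1)(p^1_{13} - p^4_{13}) + (\lambda_4 - \lambda_2)(p^1_{23} - p^4_{23}) + (\lambda_4 - \lambda_3)(p^1_{33} - p^4_{33})\\
C_{14} &= (\lambda_4 - \lambda_2)\\
C_{24} &= (\lambda_4 - \lambda_1)(p^2_{11} - p^4_{11}) + (\lambda_4 - \lambda_2)(p^2_{12} - p^4_{12}) + (\lambda_4 - \lambda_3)(p^2_{13} - p^4_{13})\\
C_{34} &= r(k - 1) + \lambda_4 + (\lambda_4 - \lambda_1)(p^2_{12} - p^4_{12}) + (\lambda_4 - \lambda_2)(p^2_{22} - p^4_{22}) + (\lambda_4 - \lambda_3)(p^2_{23} - p^4_{23})\\
C_{44} &= (\lambda_4 - \lambda_1)(p^2_{13} - p^4_{13}) + (\lambda_4 - \lambda_2)(p^2_{23} - p^4_{23}) + (\lambda_4 - \lambda_3)(p^2_{33} - p^4_{33})\\
D_{14} &= (\lambda_4 - \lambda_3)\\
D_{24} &= (\lambda_4 - \lambda_1)(p^3_{11} - p^4_{11}) + (\lambda_4 - \lambda_2)(p^3_{12} - p^4_{12}) + (\lambda_4 - \lambda_3)(p^3_{13} - p^4_{13})\\
D_{34} &= (\lambda_4 - \lambda_1)(p^3_{12} - p^4_{12}) + (\lambda_4 - \lambda_2)(p^3_{22} - p^4_{22}) + (\lambda_4 - \lambda_3)(p^3_{23} - p^4_{23})\\
D_{44} &= r(k - 1) + \lambda_4 + (\lambda_4 - \lambda_1)(p^3_{13} - p^4_{13}) + (\lambda_4 - \lambda_2)(p^3_{23} - p^4_{23}) + (\lambda_4 - \lambda_3)(p^3_{33} - p^4_{33})
\end{aligned} \tag{29}$$

Variance $V_1$ of difference between two first associates is given by $2\sigma_k^2$
times the value of the r.h.s. of (28) when the elements of the first
column of the determinant in the numerator are replaced by 1, -1, 0, 0
respectively.

Similarly $V_2$ is given by $2\sigma_k^2$ times the value of the r.h.s. of (28) when
these elements are replaced by 1, 0, -1, 0 respectively.

$V_3$ is obtained when these elements are replaced by 1, 0, 0, -1
respectively.

Finally, $V_4$ is obtained when these elements are replaced by 1, 0, 0, 0
respectively.

The mean variance is

$$\bar V = \frac{1}{t - 1}(n_1V_1 + n_2V_2 + n_3V_3 + n_4V_4) = \frac{2k\sigma_k^2}{t - 1}\begin{vmatrix} t - 1 & B_{14} & C_{14} & D_{14}\\ -n_1 & B_{24} & C_{24} & D_{24}\\ -n_2 & B_{34} & C_{34} & D_{34}\\ -n_3 & B_{44} & C_{44} & D_{44}\end{vmatrix} \div \begin{vmatrix} A_{14} & B_{14} & C_{14} & D_{14}\\ A_{24} & B_{24} & C_{24} & D_{24}\\ A_{34} & B_{34} & C_{34} & D_{34}\\ A_{44} & B_{44} & C_{44} & D_{44}\end{vmatrix} \tag{30}$$

and the efficiency factor is

$$\frac{t - 1}{rk}\begin{vmatrix} A_{14} & B_{14} & C_{14} & D_{14}\\ A_{24} & B_{24} & C_{24} & D_{24}\\ A_{34} & B_{34} & C_{34} & D_{34}\\ A_{44} & B_{44} & C_{44} & D_{44}\end{vmatrix} \div \begin{vmatrix} t - 1 & B_{14} & C_{14} & D_{14}\\ -n_1 & B_{24} & C_{24} & D_{24}\\ -n_2 & B_{34} & C_{34} & D_{34}\\ -n_3 & B_{44} & C_{44} & D_{44}\end{vmatrix} \tag{31}$$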

4. RECOVERY OF INTER-BLOCK INFORMATION 


The method for recovering inter-block information in the case of
p.b.i.b. designs was given by the author (1944). Using $Q_u$ as defined in
(7) and defining $Q'_u$ as the sum of the $r$ block means of the $u$-th treatment
minus $r$ times the general mean, the combined intra- and inter-block
estimate (denoting it by $t'_u$) of the effect of the $u$-th treatment was shown
to be obtainable by solving the $t$ equations

$$k(wQ_u + w'Q'_u) = r\{(k - 1)w + w'\}t'_u - (w - w')\Big(\lambda_1\sum t'_{u1} + \lambda_2\sum t'_{u2} + \cdots + \lambda_m\sum t'_{um}\Big) \qquad (u = 1, 2, \ldots, t) \tag{32}$$

taking along with them the arbitrary condition

$$\sum_1^t t'_u = 0 \tag{33}$$

Eliminating $\sum t'_{um}$ from (32) with the help of (33), we have

$$k(wQ_u + w'Q'_u) = [r\{(k - 1)w + w'\} + \lambda_m(w - w')]t'_u + (\lambda_m - \lambda_1)(w - w')\sum t'_{u1} + \cdots + (\lambda_m - \lambda_{m-1})(w - w')\sum t'_{u,m-1} \tag{34}$$

Comparing (34) with (11) we see that $t'_u$ can be obtained from the
corresponding expression $t_u$ if we substitute in the latter $wQ_u + w'Q'_u$ for $Q_u$;
$r\{w + [w'/(k - 1)]\}$ for $r$; and $\lambda(w - w')$ for $\lambda$ (see Rao, 1947). We
shall make use of these substitutions while deriving the expressions for
the cases $m$ = 2, 3 and 4.

Let $V_1, V_2, \ldots, V_m$ be the variances of the difference between the
values of $t'$ for two treatments which are 1st, 2nd, ..., $m$-th associates
respectively. The mean variance of the differences between all pairs of
treatments is

$$\bar V = \frac{1}{t - 1}\sum_{e=1}^{m} n_eV_e \tag{35}$$

The efficiency of the design when inter-block information is also
recovered from the data is given by $2\sigma_t^2/r\bar V$. When the design is
resolvable, this efficiency can be expressed in the form




of + — 1) 1} 
rk(t — 1)V 


(36) 


(4.1) Two-associate designs (m = 2).


The combined intra- and inter-block estimate of the effect of the $u$-th
treatment is given by

$$t'_u = k\begin{vmatrix} wQ_u + w'Q'_u & B'_{12}\\ w\sum Q_{u1} + w'\sum Q'_{u1} & B'_{22}\end{vmatrix} \div \begin{vmatrix} A'_{12} & B'_{12}\\ A'_{22} & B'_{22}\end{vmatrix} \tag{37}$$

where

$$\begin{aligned}
A'_{12} &= A_{12}(w - w') + rkw'\\
A'_{22} &= A_{22}(w - w')\\
B'_{12} &= B_{12}(w - w')\\
B'_{22} &= B_{22}(w - w') + rkw'
\end{aligned} \tag{38}$$

By putting $w' = 0$ in (37) and (38) we get $t_u$ given in (18). By putting
$w = 0$, we can get the inter-block estimate, if one exists. By putting
$w = w'$ the value of $t'_u$ reduces to $\frac{1}{r}(Q_u + Q'_u)$ = treatment mean minus
general mean, which may be called the estimate of the treatment effect
without eliminating block effects from it.
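
The two limiting cases just mentioned are easy to confirm numerically. The sketch below assumes the factor $k$ stands outside the determinants in (37), and it borrows the coefficients of the simple square lattice with $p = 5$, $l = 1$ ($A_{12} = 8$, $A_{22} = -6$, $B_{12} = -1$, $B_{22} = 7$) together with the values of $Q$ and $Q'$ for treatment (1, 1) of the numerical example in Section 6. The function name is illustrative.

```python
def combined_estimate(Q, Q1, Qp, Qp1, A12, A22, B12, B22, r, k, w, wp):
    """Combined estimate t'_u for a two-associate design.

    Q, Qp   : Q_u and Q'_u for the treatment
    Q1, Qp1 : sums of Q and Q' over its first associates
    w, wp   : intra- and inter-block weights w and w'
    """
    d = w - wp
    A12p = A12 * d + r * k * wp
    A22p = A22 * d
    B12p = B12 * d
    B22p = B22 * d + r * k * wp
    c1 = w * Q + wp * Qp
    c2 = w * Q1 + wp * Qp1
    numer = c1 * B22p - c2 * B12p          # 2 x 2 determinant in the numerator
    denom = A12p * B22p - A22p * B12p      # 2 x 2 determinant in the denominator
    return k * numer / denom


lattice = dict(A12=8, A22=-6, B12=-1, B22=7, r=2, k=5)
# Treatment (1, 1) of the example: Q = 7.6, sum over first associates = 12.6,
# Q' = -4.84, sum of Q' over first associates = -14.52.
intra = combined_estimate(7.6, 12.6, -4.84, -14.52, w=1.0, wp=0.0, **lattice)
equal = combined_estimate(7.6, 12.6, -4.84, -14.52, w=0.3, wp=0.3, **lattice)
print(round(intra, 2), round(equal, 2))
```

With $w' = 0$ the result is the intra-block estimate 6.58 shown in Table 6, and with $w = w'$ it is $(Q + Q')/r = 1.38$, the treatment mean 15.00 minus the general mean 13.62.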


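The variance of the difference between effects of two treatments
which are 1st associates is

$$V_1 = 2k\begin{vmatrix} 1 & B'_{12}\\ -1 & B'_{22}\end{vmatrix} \div \begin{vmatrix} A'_{12} & B'_{12}\\ A'_{22} & B'_{22}\end{vmatrix} = \frac{2k(B'_{12} + B'_{22})}{A'_{12}B'_{22} - A'_{22}B'_{12}} \tag{39}$$

Similarly for two second associates the variance of the difference is

$$V_2 = 2k\begin{vmatrix} 1 & B'_{12}\\ 0 & B'_{22}\end{vmatrix} \div \begin{vmatrix} A'_{12} & B'_{12}\\ A'_{22} & B'_{22}\end{vmatrix} = \frac{2kB'_{22}}{A'_{12}B'_{22} - A'_{22}B'_{12}} \tag{40}$$

The mean variance is

$$\bar V = \frac{1}{t - 1}(n_1V_1 + n_2V_2) = \frac{2k}{t - 1}\begin{vmatrix} t - 1 & B'_{12}\\ -n_1 & B'_{22}\end{vmatrix} \div \begin{vmatrix} A'_{12} & B'_{12}\\ A'_{22} & B'_{22}\end{vmatrix} \tag{41}$$

The efficiency of the resolvable type of two-associate p.b.i.b. design is
obtained by substituting the r.h.s. of (41) in (36) for $\bar V$.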


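(4.2) Three-associate designs (m = 3).


The combined intra- and inter-block estimate of the effect of the $u$-th
treatment is given by

$$t'_u = k\begin{vmatrix} wQ_u + w'Q'_u & B'_{13} & C'_{13}\\ w\sum Q_{u1} + w'\sum Q'_{u1} & B'_{23} & C'_{23}\\ w\sum Q_{u2} + w'\sum Q'_{u2} & B'_{33} & C'_{33}\end{vmatrix} \div \begin{vmatrix} A'_{13} & B'_{13} & C'_{13}\\ A'_{23} & B'_{23} & C'_{23}\\ A'_{33} & B'_{33} & C'_{33}\end{vmatrix} \tag{42}$$

where

$$\begin{aligned}
A'_{13} &= A_{13}(w - w') + rkw' & C'_{13} &= C_{13}(w - w')\\
A'_{23} &= A_{23}(w - w') & C'_{23} &= C_{23}(w - w')\\
A'_{33} &= A_{33}(w - w') & C'_{33} &= C_{33}(w - w') + rkw'\\
B'_{13} &= B_{13}(w - w')\\
B'_{23} &= B_{23}(w - w') + rkw'\\
B'_{33} &= B_{33}(w - w')
\end{aligned} \tag{43}$$

Variance $V_1$ of difference between two first associates is twice the
r.h.s. of (42) when the elements of the first column of the determinant
in the numerator are replaced by 1, -1, 0 respectively.

Similarly $V_2$ is twice the r.h.s. of (42) when these elements are
replaced by 1, 0, -1 respectively.

Finally $V_3$ is twice the r.h.s. of (42) when these elements are
replaced by 1, 0, 0 respectively.

The mean variance is

$$\bar V = \frac{1}{t - 1}(n_1V_1 + n_2V_2 + n_3V_3) = \frac{2k}{t - 1}\begin{vmatrix} t - 1 & B'_{13} & C'_{13}\\ -n_1 & B'_{23} & C'_{23}\\ -n_2 & B'_{33} & C'_{33}\end{vmatrix} \div \begin{vmatrix} A'_{13} & B'_{13} & C'_{13}\\ A'_{23} & B'_{23} & C'_{23}\\ A'_{33} & B'_{33} & C'_{33}\end{vmatrix} \tag{44}$$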




The efficiency of the resolvable type of three-associate p.b.i.b. design 
is obtained by substituting the r. h. s. of (44) in (36) for V. 


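(4.3) Four-associate designs (m = 4).
The combined intra- and inter-block estimate of the effect of the $u$-th
treatment is given by

$$t'_u = k\begin{vmatrix} wQ_u + w'Q'_u & B'_{14} & C'_{14} & D'_{14}\\ w\sum Q_{u1} + w'\sum Q'_{u1} & B'_{24} & C'_{24} & D'_{24}\\ w\sum Q_{u2} + w'\sum Q'_{u2} & B'_{34} & C'_{34} & D'_{34}\\ w\sum Q_{u3} + w'\sum Q'_{u3} & B'_{44} & C'_{44} & D'_{44}\end{vmatrix} \div \begin{vmatrix} A'_{14} & B'_{14} & C'_{14} & D'_{14}\\ A'_{24} & B'_{24} & C'_{24} & D'_{24}\\ A'_{34} & B'_{34} & C'_{34} & D'_{34}\\ A'_{44} & B'_{44} & C'_{44} & D'_{44}\end{vmatrix} \tag{45}$$

where

$$\begin{aligned}
A'_{14} &= A_{14}(w - w') + rkw' & C'_{14} &= C_{14}(w - w')\\
A'_{24} &= A_{24}(w - w') & C'_{24} &= C_{24}(w - w')\\
A'_{34} &= A_{34}(w - w') & C'_{34} &= C_{34}(w - w') + rkw'\\
A'_{44} &= A_{44}(w - w') & C'_{44} &= C_{44}(w - w')\\
B'_{14} &= B_{14}(w - w') & D'_{14} &= D_{14}(w - w')\\
B'_{24} &= B_{24}(w - w') + rkw' & D'_{24} &= D_{24}(w - w')\\
B'_{34} &= B_{34}(w - w') & D'_{34} &= D_{34}(w - w')\\
B'_{44} &= B_{44}(w - w') & D'_{44} &= D_{44}(w - w') + rkw'
\end{aligned} \tag{46}$$

The variance $V_1$ of the difference between two first associates is
twice the r.h.s. of (45) when the elements of the first column of the
determinant in the numerator are replaced by 1, -1, 0, 0 respectively.

$V_2$ is twice the r.h.s. of (45) when these elements are replaced by
1, 0, -1, 0 respectively.

$V_3$ is twice the r.h.s. of (45) when these elements are replaced by
1, 0, 0, -1 respectively.

Finally $V_4$ is twice the r.h.s. of (45) when these elements are
replaced by 1, 0, 0, 0 respectively.

The mean variance is

$$\bar V = \frac{1}{t - 1}(n_1V_1 + n_2V_2 + n_3V_3 + n_4V_4) = \frac{2k}{t - 1}\begin{vmatrix} t - 1 & B'_{14} & C'_{14} & D'_{14}\\ -n_1 & B'_{24} & C'_{24} & D'_{24}\\ -n_2 & B'_{34} & C'_{34} & D'_{34}\\ -n_3 & B'_{44} & C'_{44} & D'_{44}\end{vmatrix} \div \begin{vmatrix} A'_{14} & B'_{14} & C'_{14} & D'_{14}\\ A'_{24} & B'_{24} & C'_{24} & D'_{24}\\ A'_{34} & B'_{34} & C'_{34} & D'_{34}\\ A'_{44} & B'_{44} & C'_{44} & D'_{44}\end{vmatrix} \tag{47}$$

The efficiency of the resolvable type of four-associate p.b.i.b. design
is obtained by substituting the r.h.s. of (47) in (36) for $\bar V$.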


5. ANALYSIS OF THE SIMPLE SQUARE LATTICE 


It has been shown by Bose and Nair (1939) that the simple, triple and
higher order square lattices are p.b.i.b. designs. For illustration, the
parameters of the simple square lattice are given below assuming that the
basic two-replicate design is repeated $l$ times.

$$k = p,\quad r = 2l,\quad b = 2lp,\quad t = p^2,\quad \lambda_1 = l,\quad \lambda_2 = 0,\quad n_1 = 2(p - 1),\quad n_2 = (p - 1)^2$$

$$(p^1_{fg}) = \begin{pmatrix} p - 2 & p - 1\\ p - 1 & (p - 1)(p - 2)\end{pmatrix},\qquad (p^2_{fg}) = \begin{pmatrix} 2 & 2(p - 2)\\ 2(p - 2) & (p - 2)^2\end{pmatrix}$$

The $p^2$ treatments are identified by a pair of numbers $(h, i)$ where $h$
and $i$ can take values 1, 2, ..., $p$. The treatment numbers may be
written as in Fig. 1 below.

  h \ i      1        2      ...      p
    1      (1,1)    (1,2)    ...    (1,p)
    2      (2,1)    (2,2)    ...    (2,p)
    .        .        .               .
    p      (p,1)    (p,2)    ...    (p,p)

FIG. 1


There are $l$ "x-replications" in each of which the rows of Fig. 1 are taken
as blocks and $l$ "y-replications" in which the columns of Fig. 1 are taken
as blocks. In the field, treatments are randomized within blocks, and
blocks within replications.
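
The parameters of the simple square lattice can be confirmed by direct enumeration. The sketch below assumes that two treatments are first associates when they share a row or a column of Fig. 1 (so that they occur together in $l$ blocks) and second associates otherwise. It counts $n_1$, $n_2$ and the $p^e_{fg}$ for every ordered pair and checks that the counts do not depend on the pair chosen. The function name is illustrative.

```python
from itertools import product


def square_lattice_parameters(p):
    """Return ((n1, n2), P1, P2) for the p x p simple square lattice."""
    treatments = list(product(range(p), repeat=2))

    def associate_class(a, b):
        # 1 = same row or same column of Fig. 1, 2 = neither
        return 1 if (a[0] == b[0] or a[1] == b[1]) else 2

    seen = {}  # class e -> set of parameter tuples found over all pairs
    for a in treatments:
        n = [0, 0]
        for c in treatments:
            if c != a:
                n[associate_class(a, c) - 1] += 1
        for b in treatments:
            if b == a:
                continue
            e = associate_class(a, b)
            P = [[0, 0], [0, 0]]  # P[f-1][g-1] = p^e_fg for this pair
            for c in treatments:
                if c != a and c != b:
                    P[associate_class(a, c) - 1][associate_class(b, c) - 1] += 1
            seen.setdefault(e, set()).add((tuple(n), tuple(map(tuple, P))))
    # Partial balance: exactly one parameter set per associate class.
    assert all(len(v) == 1 for v in seen.values())
    (n, P1), = seen[1]
    (_, P2), = seen[2]
    return n, P1, P2


n, P1, P2 = square_lattice_parameters(4)
print(n, P1, P2)
```

For $p = 4$ this gives $n_1 = 6$, $n_2 = 9$ and the two matrices with entries (2, 3; 3, 6) and (2, 4; 4, 4), in agreement with the general expressions above. Relation (2) also holds, since $n_1\lambda_1 + n_2\lambda_2 = 2(p-1)l = r(k-1)$.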

Let $x_{hij}$ and $y_{hij}$ be the value of the character under observation for
the $j$-th "x-replication" and "y-replication" respectively of the treatment
$(h, i)$. Let $x_{hi.}$ and $y_{hi.}$ represent the total value for the $l$ "x-replications"
and the $l$ "y-replications" respectively of the treatment $(h, i)$. These
values and their marginal totals are represented in Fig. 2 and Fig. 3
respectively.

  h \ i       1          2        ...       p         Total
    1      $x_{11.}$   $x_{12.}$   ...   $x_{1p.}$   $x_{1..}$
    2      $x_{21.}$   $x_{22.}$   ...   $x_{2p.}$   $x_{2..}$
    .         .          .                  .           .
    p      $x_{p1.}$   $x_{p2.}$   ...   $x_{pp.}$   $x_{p..}$
  Total    $x_{.1.}$   $x_{.2.}$   ...   $x_{.p.}$   $x_{...}$

FIG. 2. TREATMENT TOTALS OF THE $l$ "x-REPLICATIONS."

  h \ i       1          2        ...       p         Total
    1      $y_{11.}$   $y_{12.}$   ...   $y_{1p.}$   $y_{1..}$
    2      $y_{21.}$   $y_{22.}$   ...   $y_{2p.}$   $y_{2..}$
    .         .          .                  .           .
    p      $y_{p1.}$   $y_{p2.}$   ...   $y_{pp.}$   $y_{p..}$
  Total    $y_{.1.}$   $y_{.2.}$   ...   $y_{.p.}$   $y_{...}$

FIG. 3. TREATMENT TOTALS OF THE $l$ "y-REPLICATIONS."

The total for the $h$-th block of the $j$-th "x-replication" will be
represented by $x_{h.j}$ and of the $j$-th "y-replication" by $y_{.ij}$. The total for
treatment $(h, i)$ will be represented by $T_{hi} = (x_{hi.} + y_{hi.})$.

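The $Q_u$ and $Q'_u$ defined in earlier sections will have to be replaced by
$Q_{hi}$ and $Q'_{hi}$ respectively as each treatment is represented by two numbers
$(h, i)$. Let $t'_{hi}$ stand for the combined intra- and inter-block estimate of
the effect of treatment $(h, i)$.

Value of $t'_{hi}$ is obtained from the equation

$$t'_{hi} = p\begin{vmatrix} wQ_{hi} + w'Q'_{hi} & B'_{12}\\ w\sum Q_{hi1} + w'\sum Q'_{hi1} & B'_{22}\end{vmatrix} \div \begin{vmatrix} A'_{12} & B'_{12}\\ A'_{22} & B'_{22}\end{vmatrix} \tag{48}$$

where $\sum Q_{hi1}$ and $\sum Q'_{hi1}$ are the sum of the values of $Q$ and $Q'$
respectively for the $2(p - 1)$ first associates of treatment $(h, i)$ and

$$\begin{aligned}
A'_{12} &= 2l\{(p - 1)w + w'\}\\
A'_{22} &= -2l(p - 2)(w - w')\\
B'_{12} &= -l(w - w')\\
B'_{22} &= l\{(p + 2)w + (p - 2)w'\}
\end{aligned} \tag{49}$$

Hence,

$$t'_{hi} = \frac{\{(p + 2)w + (p - 2)w'\}(wQ_{hi} + w'Q'_{hi}) + (w - w')(w\sum Q_{hi1} + w'\sum Q'_{hi1})}{2lpw(w + w')} \tag{50}$$

Since

$$\sum Q_{hi1} = Q_{h.} + Q_{.i} - 2Q_{hi},\qquad \sum Q'_{hi1} = Q'_{h.} + Q'_{.i} - 2Q'_{hi},$$

the r.h.s. of (50) can be simplified to

$$t'_{hi} = \frac{1}{2lw}\left[(wQ_{hi} + w'Q'_{hi}) + \frac{\gamma}{p}\{w(Q_{h.} + Q_{.i}) + w'(Q'_{h.} + Q'_{.i})\}\right] \tag{51}$$

where

$$\gamma = \frac{w - w'}{w + w'}.$$

To estimate $w$ and $w'$ we have to perform the analysis of variance as
in Table 1 which requires calculating the intra-block estimate of
treatment effect, namely, $t_{hi}$. This is obtained by putting $w' = 0$ in (51).
Thus

$$t_{hi} = \frac{1}{2l}\left[Q_{hi} + \frac{1}{p}(Q_{h.} + Q_{.i})\right] = \frac{1}{2l}\left[Q_{hi} + \bar Q_{h.} + \bar Q_{.i}\right] \tag{52}$$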

The Sum of Squares due to treatments (eliminating blocks) given by 
(12) reduces in this case to 


The Sum of Squares due to blocks (eliminating treatments) given by 
(13) reduces in this case to 


Since the design is resolvable, we can eliminate from (54) the part due 
to replications as done in Table 2 of the author’s (1944) paper. 
The Sum of Squares due to replications is 


| + 21 | (55) 


i=1 
Hence the Sum of Squares due to blocks within replications (eliminat- 
ing treatments) is obtained by subtracting (55) from (54). It can be 
split up into two components (a) and (b). Component (a) is 


h=1 j=1 t=1 j=1 


and represents the Sum of Squares between similar blocks of the $l$
"x-replications" and the $l$ "y-replications." It has $2(l - 1)(p - 1)$
degrees of freedom and disappears when $l = 1$, i.e. when the two basic
x and y replications are used without repetition.

Component (b) is obtained by subtracting the sum of (55) and (56)
from (54) and reduces to


triQni + > z.. + > (as. + Yri.) 


(57) 


1 
(2... — y...)” 




This component has 2(p — 1) degrees of freedom. Making use of the 
r. h. s. of (53) and remembering that 


Q:. 


ll 
< 
> 


1 
Pp J 


we can further simplify (57) to the form 


1 1 2 


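We may present the analysis of variance as in Table 2.
According to the author's (1944) paper the estimates of $w$ and $w'$ are
obtained from this table as follows:

$$w = \frac{1}{E_e},\qquad w' = \frac{2l - 1}{2lE_b - E_e} \tag{59}$$

Hence estimate of $\gamma$ is

$$\gamma = \frac{w - w'}{w + w'} = \frac{l(E_b - E_e)}{lE_b + (l - 1)E_e} \tag{60}$$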

These estimates of $w$, $w'$ and $\gamma$ are used in (51) to get estimates of
treatment effects after recovery of inter-block information.
Using (57a) and the relations: 


Qi = (z..; + y.:.) 


Qi. = (60a) 
= Yu. 


Qh. + = PQ: 


TABLE 2

ANALYSIS OF VARIANCE OF THE SIMPLE SQUARE LATTICE.

Source of Variation             Degrees of freedom           Sum of Squares                                                                          Mean Square
Replications                    $2l - 1$                     (55)
Blocks within replications      $2l(p - 1)$                  (54) - (55) or (56) + (57) or (56) + (58)                                               $E_b$
 (adjusted for treatments)
Treatments                      $p^2 - 1$                    (53)                                                                                    $E_t$
 (adjusted for blocks)
Intra-block error               $(2lp - p - 1)(p - 1)$       $\sum (x_{hij}^2 + y_{hij}^2) - (53) - \frac{1}{p}\sum_{j=1}^{l}\sum_{h=1}^{p}(x_{h.j}^2 + y_{.hj}^2)$   $E_e$

Note: If $l = 1$, (56) disappears so that (54) - (55) = (57) or (58).

(51) can be simplified to

$$t'_{hi} = \frac{1}{2l}(x_{hi.} + y_{hi.}) + \frac{\gamma}{2lp}\left[(y_{h..} - x_{h..}) + (x_{.i.} - y_{.i.})\right] - \frac{1}{2lp^2}(x_{...} + y_{...}) \tag{61}$$

By putting $\gamma = 1$ in (61), we get the following simplified form of (52).

$$t_{hi} = \frac{1}{2l}(x_{hi.} + y_{hi.}) + \frac{1}{2lp}\left[(y_{h..} - x_{h..}) + (x_{.i.} - y_{.i.})\right] - \frac{1}{2lp^2}(x_{...} + y_{...}) \tag{62}$$

The adjusted treatment mean with recovery of inter-block information is

$$\frac{1}{2lp^2}(x_{...} + y_{...}) + t'_{hi} \tag{63}$$

The adjusted treatment mean without recovery of inter-block
information is obtained by putting $\gamma = 1$ in (63).


6. NUMERICAL EXAMPLE OF THE SIMPLE SQUARE LATTICE ANALYSED AS A 
PARTIALLY BALANCED INCOMPLETE BLOCK DESIGN. 


The data for this example have been taken from p. 283 of
Experimental Designs by Cochran and Cox. It may be noted that $l = 1$ and
$p = 5$ in this example.

The simplified method of analysis used by Cochran and Cox amounts
to using (58) for calculating component (b) of the Sum of Squares for
blocks (adj) and using (63) for calculating the adjusted treatment means.
They have not calculated the Sum of Squares due to treatments (adj)
given by (53).

In order that the example may faithfully illustrate the analysis of a
general resolvable p.b.i.b. design we shall do the analysis using (54) and
(55) for calculating the Sum of Squares for blocks (adj) within
replications and the l.h.s. of (53) for calculating the Sum of Squares for
treatments (adj). We shall also use (51) and (52) instead of their simplified
versions (61) and (62) for estimating treatment effects with and without
recovery of inter-block information respectively. The various steps in
the computation are clear from Tables 3 to 11 given below.


x-REPLICATION
TABLE 3.1. VALUES OF $x_{hi}$

  Block                                  Block Total   Block Mean
   No.     i = 1     2     3     4     5   ($x_{h.}$)   ($\bar x_{h.}$)
    1        6       7     5     8     6      32          6.4
    2       16      12    12    13     8      61         12.2
    3       17       7     7     9    14      54         10.8
    4       18      16    13    13    14      74         14.8
    5       14      15    11    14    14      68         13.6
  Total ($x_{.i}$)
            71      57    48    57    56     289 = $x_{..}$


y-REPLICATION 


BIOMETRICS, JUNE 1952 


TABLE 3.2. VALUES OF yni 


50 


Block Total 
No — 6 7 8 9 10 (yn.) 
24 21 16 17 15 93 
13 ll 4 10 15 53 
24 14 12 30 22 102 
il 1l 12 9 16 59 
8 23 12 23 19 85 
Block (y.;) 80 80 56 89 87 392 = y,, 
Total 
Block (7,;) 16.0 16.0 11.2 17.8 17.4 
Mean 
28 92 
General Mean = 280 + 302 = 13.62 


TABLE 4. TREATMENT TOTALS (x_hi + y_hi) AND BLOCK MEANS (x̄_h. AND ȳ.i)

                                            x̄_h.
          30     28     21     25     21     6.4
          29     23     16     23     23    12.2
          41     21     19     39     36    10.8
          29     27     25     22     30    14.8
          22     38     23     37     33    13.6

ȳ.i      16.0   16.0   11.2   17.8   17.4

13.62 = ½(x̄.. + ȳ..) = general mean.


TABLE 5. VALUES OF Q_hi = (x_hi + y_hi) − (x̄_h. + ȳ.i)

                          Q_hi                          Q_h.     Q̄_h.

         +7.6    +5.6    +3.4    +0.8    −2.8         +14.6    +2.92
         +0.8    −5.2    −7.4    −7.0    −6.6         −25.4    −5.08
        +14.2    −5.8    −3.0   +10.4    +7.8         +23.6    +4.72
         −1.8    −3.8    −1.0   −10.6    −2.2         −19.4    −3.88
         −7.6    +8.4    −1.8    +5.6    +2.0          +6.6    +1.32

Q.i     +13.2    −0.8    −9.8    −0.8    −1.8           0        0
Q̄.i     +2.64   −0.16   −1.96   −0.16   −0.36           0




TABLE 6. INTRA-BLOCK ESTIMATE OF TREATMENT EFFECT

t_hi = ½{Q_hi + Q̄_h. + Q̄.i}

         +6.58    +4.18    +2.18    +1.78    −0.12
         −0.82    −5.22    −7.22    −6.12    −6.02
        +10.78    −0.62    −0.12    +7.48    +6.08
         −1.52    −3.92    −3.42    −7.32    −3.22
         −1.82    +4.78    −1.22    +3.38    +1.48


Treatment sum of squares (adjusted) = ΣΣ t_hi Q_hi = 711.120
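The arithmetic of Tables 4 to 6 can be retraced in a few lines. The sketch below is only an illustration of the tabulated steps for this example (l = 1, p = 5); the variable names are ours, and the formulas used are the ones printed in the captions of Tables 5 and 6, not the general expressions (51) to (53).

```python
# Data of Tables 3.1 and 3.2: rows are h = 1..5, columns are i = 1..5.
x = [[6, 7, 5, 8, 6], [16, 12, 12, 13, 8], [17, 7, 7, 9, 14],
     [18, 16, 13, 13, 14], [14, 15, 11, 14, 14]]
y = [[24, 21, 16, 17, 15], [13, 11, 4, 10, 15], [24, 14, 12, 30, 22],
     [11, 11, 12, 9, 16], [8, 23, 12, 23, 19]]
p = 5
idx = range(p)

# Block means: rows are blocks in the x-replication, columns in the y-replication.
x_block_mean = [sum(x[h]) / p for h in idx]
y_block_mean = [sum(y[h][i] for h in idx) / p for i in idx]

# Table 4: treatment totals.  Table 5: Q_hi = T_hi - (xbar_h. + ybar_.i).
T = [[x[h][i] + y[h][i] for i in idx] for h in idx]
Q = [[T[h][i] - x_block_mean[h] - y_block_mean[i] for i in idx] for h in idx]
Q_row_mean = [sum(Q[h]) / p for h in idx]
Q_col_mean = [sum(Q[h][i] for h in idx) / p for i in idx]

# Table 6: intra-block estimates, and the adjusted treatment sum of squares.
t = [[0.5 * (Q[h][i] + Q_row_mean[h] + Q_col_mean[i]) for i in idx] for h in idx]
treatment_ss = sum(t[h][i] * Q[h][i] for h in idx for i in idx)
general_mean = (sum(map(sum, x)) + sum(map(sum, y))) / (2 * p * p)

print(round(general_mean, 2), round(treatment_ss, 2))  # 13.62 711.12
```

Working from the raw observations in this way also serves as a check on the individual entries of Tables 5 and 6.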


TABLE 7. ANALYSIS OF VARIANCE

                                    Degrees       Sum of      Mean
Source of variation                 of freedom    Squares     Square

Replications                             1         212.18
Blocks within rep. (unadjusted)          8         350.00
Treatments (adjusted)                   24         711.12     29.63   E_t
Intra-block error                       16         218.48     13.66   E_e

Total                                   49        1491.78

Treatments (unadjusted)                 24         559.28
Blocks within rep. (adjusted)            8         501.84     62.73   E_b


Variance ratio, F = E_t/E_e = 29.63/13.66 = 2.17, lies between the 10 and the 5% levels.

Estimate of $\dfrac{w'}{w} = \dfrac{E_e}{2E_b - E_e} = \dfrac{13.66}{2 \times 62.73 - 13.66} = 0.12218$

Estimate of $y = \dfrac{w - w'}{w + w'} = \dfrac{E_b - E_e}{E_b} = \dfrac{62.73 - 13.66}{62.73} = 0.78224$
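The three figures quoted under Table 7 follow directly from the rounded mean squares. In the short check below the weights are taken as w = 1/E_e and w' = 1/(2E_b − E_e); this is our assumption for the two-replicate case, adopted because it reproduces both printed values, and is not a restatement of the general definitions given earlier in the paper.

```python
# Rounded mean squares from Table 7.
E_t, E_e, E_b = 29.63, 13.66, 62.73

F = E_t / E_e                       # variance ratio for treatments (adjusted)

# Assumed weights for l = 1 (two replications); see the note above.
w = 1 / E_e
w_prime = 1 / (2 * E_b - E_e)

ratio = w_prime / w                 # equals E_e / (2 E_b - E_e)
y_weight = (w - w_prime) / (w + w_prime)   # equals (E_b - E_e) / E_b

print(round(F, 2), round(ratio, 5), round(y_weight, 5))  # 2.17 0.12218 0.78224
```

Note that the printed values are obtained from the mean squares as rounded in the table; carrying more decimals in E_e changes the fifth figure.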


TABLE 8. VALUES OF Q'_hi = (x̄_h. + ȳ.i) − 2 × GENERAL MEAN
                         = (x̄_h. + ȳ.i) − 27.24

         −4.84    −4.84    −9.64    −3.04    −3.44
         +0.96    +0.96    −3.84    +2.76    +2.36
         −0.44    −0.44    −5.24    +1.36    +0.96
         +3.56    +3.56    −1.24    +5.36    +4.96
         +2.36    +2.36    −2.44    +4.16    +3.76


TABLE 9. VALUES [this table is printed sideways in the original; its heading and entries are not recoverable from this copy]



TABLE 10. COMBINED INTRA- AND INTER-BLOCK ESTIMATE OF TREATMENT 
EFFECT 


= (a +2 a.) +2 {(a. (0. 


+5.4480 | +3.3525 | +1.0260 | +1.1485 | —0.7730 
—0.4500 | —4.5455 | —6.8715 | —5.2495 | —5.1710 
+9.9305 | —1.1650 | —0.9910 | +7.1310 | +5.7095 
—0.9970 | —3.0925 | —2.9190 | —6.2965 | —2.2180 
—1.9945 | +4.9100 | —1.4160 | +3.7060 | +1.7845 


TABLE 11. TREATMENT MEAN (ADJUSTED WITH RECOVERY OF INTER-BLOCK 
INFORMATION) = 13.62 + ini 


         19.07    16.97    14.65    14.77    12.85
         13.17     9.07     6.75     8.37     8.45
         23.55    12.46    12.63    20.75    19.33
         12.62    10.53    10.70     7.32    11.40
         11.63    18.53    12.20    17.33    15.40


The values given in Table 11, when multiplied by 2, give the adjusted 
treatment totals calculated by Cochran and Cox in the bottom table on 
p. 283 of Experimental Designs. 
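Tables 10 and 11 can be checked numerically against the earlier tables. The sketch below rests on an observation of ours about the tabulated numbers, not on the paper's formula (51): each entry of Table 10 agrees, to about three decimals, with a weighted mean of the unadjusted effect (T_hi/2 − general mean) and the intra-block estimate t_hi of Table 6, the weight on the latter being the estimate y = 0.78224.

```python
x = [[6, 7, 5, 8, 6], [16, 12, 12, 13, 8], [17, 7, 7, 9, 14],
     [18, 16, 13, 13, 14], [14, 15, 11, 14, 14]]
y = [[24, 21, 16, 17, 15], [13, 11, 4, 10, 15], [24, 14, 12, 30, 22],
     [11, 11, 12, 9, 16], [8, 23, 12, 23, 19]]
p, gm, y_weight = 5, 13.62, 0.78224
idx = range(p)

xb = [sum(x[h]) / p for h in idx]                      # x-block means
yb = [sum(y[h][i] for h in idx) / p for i in idx]      # y-block means
T = [[x[h][i] + y[h][i] for i in idx] for h in idx]
Q = [[T[h][i] - xb[h] - yb[i] for i in idx] for h in idx]
Qr = [sum(Q[h]) / p for h in idx]
Qc = [sum(Q[h][i] for h in idx) / p for i in idx]
t = [[0.5 * (Q[h][i] + Qr[h] + Qc[i]) for i in idx] for h in idx]   # Table 6

# Weighted mean of unadjusted and intra-block effects (our reading of Table 10).
t_hat = [[(1 - y_weight) * (T[h][i] / 2 - gm) + y_weight * t[h][i]
          for i in idx] for h in idx]

table10 = [[5.4480, 3.3525, 1.0260, 1.1485, -0.7730],
           [-0.4500, -4.5455, -6.8715, -5.2495, -5.1710],
           [9.9305, -1.1650, -0.9910, 7.1310, 5.7095],
           [-0.9970, -3.0925, -2.9190, -6.2965, -2.2180],
           [-1.9945, 4.9100, -1.4160, 3.7060, 1.7845]]
max_diff = max(abs(t_hat[h][i] - table10[h][i]) for h in idx for i in idx)

# Table 11: adjusted means; twice these are the adjusted treatment totals.
adjusted_mean = [[gm + t_hat[h][i] for i in idx] for h in idx]
print(max_diff < 1e-3, round(adjusted_mean[0][0], 2))
```

The small residual differences are of the size to be expected from rounding in the hand computation.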


7. ANALYSIS OF THE SIMPLE RECTANGULAR LATTICE 


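It has been shown by the author (1951a) that the simple rectangular
lattice for p(p − 1) treatments (p ≥ 4) is a four-associate p.b.i.b.
design. For the purpose of analysis let us assume that the two-replicate
design is repeated l times so that the parameters are

$$t = p(p-1), \qquad k = (p-1), \qquad r = 2l, \qquad b = 2lp,$$

$$\lambda_1 = l, \qquad \lambda_2 = 0, \qquad \lambda_3 = 0, \qquad \lambda_4 = 0,$$

$$n_1 = 2(p-2), \qquad n_2 = (p-2)(p-3), \qquad n_3 = 2(p-2), \qquad n_4 = 1,$$

$$(p^1_{jk}) = \begin{bmatrix} (p-3) & (p-3) & 1 & 0 \\ (p-3) & (p-3)(p-4) & (p-3) & 0 \\ 1 & (p-3) & (p-3) & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}, \qquad (p^2_{jk}) = \begin{bmatrix} 2 & 2(p-4) & 2 & 0 \\ 2(p-4) & (p-4)(p-5) & 2(p-4) & 1 \\ 2 & 2(p-4) & 2 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix},$$

$$(p^3_{jk}) = \begin{bmatrix} 1 & (p-3) & (p-3) & 1 \\ (p-3) & (p-3)(p-4) & (p-3) & 0 \\ (p-3) & (p-3) & 1 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}, \qquad (p^4_{jk}) = \begin{bmatrix} 0 & 0 & 2(p-2) & 0 \\ 0 & (p-2)(p-3) & 0 & 0 \\ 2(p-2) & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$

When p = 3, $n_2$ vanishes and hence the simple rectangular lattice for 6
treatments in blocks of 2 plots becomes a three-associate p.b.i.b. design.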

The p(p − 1) treatments are identified by a pair of numbers (h, i)
where h and i remain unequal but can take values 1, 2, ..., p. The
treatment numbers may be written out as in Fig. 4.


                                   i
          1        2        3       ...     (p − 1)        p
  h
  1       -      (1,2)    (1,3)     ...    (1,p − 1)     (1,p)
  2     (2,1)      -      (2,3)     ...    (2,p − 1)     (2,p)
  3     (3,1)    (3,2)      -       ...    (3,p − 1)     (3,p)
  .       .        .        .                  .           .
  p     (p,1)    (p,2)    (p,3)     ...    (p,p − 1)       -

FIG. 4.


There are l "x-replications" in each of which the rows of Fig. 4 are
taken as blocks and l "y-replications" in which the columns of Fig. 4
are taken as blocks. It will be noticed that the design is of the resolvable
type. In the field, treatments are randomized within blocks, and blocks
within replications.


                                   i
          1        2        3       ...        p         Total
  h
  1       -      x_12.    x_13.     ...      x_1p.       x_1..
  2     x_21.      -      x_23.     ...      x_2p.       x_2..
  3     x_31.    x_32.      -       ...      x_3p.       x_3..
  .       .        .        .                  .           .
  p     x_p1.    x_p2.    x_p3.     ...        -         x_p..

Total   x_.1.    x_.2.    x_.3.     ...      x_.p.       x_...

FIG. 5. TREATMENT TOTALS OF THE l "x-REPLICATIONS".


                                   i
          1        2        3       ...        p         Total
  h
  1       -      y_12.    y_13.     ...      y_1p.       y_1..
  2     y_21.      -      y_23.     ...      y_2p.       y_2..
  3     y_31.    y_32.      -       ...      y_3p.       y_3..
  .       .        .        .                  .           .
  p     y_p1.    y_p2.    y_p3.     ...        -         y_p..

Total   y_.1.    y_.2.    y_.3.     ...      y_.p.       y_...

FIG. 6. TREATMENT TOTALS OF THE l "y-REPLICATIONS".
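The layout just described, and the four-associate structure stated at the head of this section, can be verified by direct enumeration for any particular p. The sketch below is an illustration under our own reading of the association scheme: two treatments are first associates if they share a row or a column of Fig. 4, fourth associates if one is (h, i) and the other (i, h), third associates if they otherwise have a number in common, and second associates if all four numbers differ. The function names are ours.

```python
def treatments(p):
    """All ordered pairs (h, i) with h != i, as in Fig. 4."""
    return [(h, i) for h in range(1, p + 1) for i in range(1, p + 1) if h != i]

def blocks(p):
    """Blocks of one x-replication (rows) and one y-replication (columns)."""
    rows = [[(h, i) for i in range(1, p + 1) if i != h] for h in range(1, p + 1)]
    cols = [[(h, i) for h in range(1, p + 1) if h != i] for i in range(1, p + 1)]
    return rows, cols

def assoc(a, b):
    """Associate class (1 to 4) of two distinct treatments; our reading."""
    (h, i), (j, k) = a, b
    if h == j or i == k:
        return 1            # same row or same column: occur together in a block
    if h == k and i == j:
        return 4            # (h, i) and (i, h)
    if h == k or i == j:
        return 3            # one number in common, in opposite positions
    return 2                # no number in common

def class_sizes(p):
    a = (1, 2)
    sizes = [0, 0, 0, 0]
    for c in treatments(p):
        if c != a:
            sizes[assoc(a, c) - 1] += 1
    return sizes

def p_matrix(p, cls):
    """Count, for a pair of cls-th associates, treatments by class w.r.t. each."""
    ts = treatments(p)
    a = ts[0]
    b = next(c for c in ts if c != a and assoc(a, c) == cls)
    P = [[0] * 4 for _ in range(4)]
    for c in ts:
        if c != a and c != b:
            P[assoc(a, c) - 1][assoc(b, c) - 1] += 1
    return P

def p_expected(p):
    """The four matrices as written out in the text, for p >= 4."""
    q = p - 3
    return {
        1: [[q, q, 1, 0], [q, q * (p - 4), q, 0], [1, q, q, 1], [0, 0, 1, 0]],
        2: [[2, 2 * (p - 4), 2, 0], [2 * (p - 4), (p - 4) * (p - 5), 2 * (p - 4), 1],
            [2, 2 * (p - 4), 2, 0], [0, 1, 0, 0]],
        3: [[1, q, q, 1], [q, q * (p - 4), q, 0], [q, q, 1, 0], [1, 0, 0, 0]],
        4: [[0, 0, 2 * (p - 2), 0], [0, (p - 2) * (p - 3), 0, 0],
            [2 * (p - 2), 0, 0, 0], [0, 0, 0, 0]],
    }

for p in (5, 6):
    print(p, class_sizes(p), all(p_matrix(p, c) == p_expected(p)[c] for c in (1, 2, 3, 4)))
```

For l repetitions of the two-replicate design the same blocks are simply used l times, so that b = 2lp and r = 2l while the association scheme is unchanged.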

Let $x_{hij}$ and $y_{hij}$ be the value of the character under observation for
the j-th "x-replication" and "y-replication" respectively of the treatment
(h, i). Let $x_{hi.}$ and $y_{hi.}$ represent the total value for the l "x-replications"
and the l "y-replications" respectively. These values and their
marginal totals are represented in Fig. 5 and Fig. 6 respectively.

The total for the h-th block of the j-th "x-replication" will be represented
by $x_{h.j}$ and of the j-th "y-replication" by $y_{.hj}$. The total for
treatment (h, i) will be represented by $T_{hi} = (x_{hi.} + y_{hi.})$.

The quantities Q and Q′ defined in earlier sections, which there carry a
single treatment suffix, will have to be replaced by $Q_{hi}$ and $Q'_{hi}$
respectively as each treatment is represented by two numbers (h, i). Let
$\hat{t}_{hi}$ stand for the combined intra- and inter-block estimate of the
effect of treatment (h, i). The value of $\hat{t}_{hi}$ is obtained from
the equation


= wD Bo Che Dis | Abe Bh Che Dis 
Quiz + wD) Bs Ch, Dis Al, Bi, Ch, Di, 
Quis + Bu Ch Dis Bla Cle Dia 


= (Say) 
where 


Af, = 2U(p — 2)(w — w’) + 2Up — 1)w’) 
Al, = —21(p — 2)(w — w') 


=0 
4a = 2U(p — 2)(w — w’) 
= —U(w — 

By, = Up — — w’) + — Iw’ 

By, = — 3)(w — w’) 

Bi = U2p — 5)(w — w') 
1 = 0 


Cx = —2l(w — w’) 
Cy, = 4U(w — w’) + 2l(p — 1)w’ 
= 2l(p — 3)(w — w’) 


(64) 




Di, = 0 

Di, = —lw — w’) (65) 
cont. 
Dis = — 5)(w — w’) + — 1)w’) 


The value of Aj , the lower determinant in (64), simplifies to the form 
As = — 1)?w*{(p — 1)(w + w’) + (w 

(66) 
Now, 


Qn = Tri (t.. + (67) 


Let Qs. = >> Q,; (over permissible values of 7) 
Q.; = >> Q:; (over permissible values of h) 


Then 
= + (yn. — y...) (68) 
Similarly 


Let us now introduce three quantities \, 4 and y defined as follows: 


(74) 


(75) 


w— w' 


The value of é,, given in (64) can be reduced to the form 

1 

hi = + w’Qii) + (A — w){wQ,. + Q.,) 


+ w'(Qi. + Q)} — + + w’'(Qi. + (76) 
This can be further simplified to the form 
1 
hi = 21 {Ti + (A = 
(x... + y...) 
2lp(p — 1) 
V, the variance of the difference between two first associates is the 


value of 2kA//Aj when the elements of the first column of Aj are re- 
placed by 1, —1, 0, 0. This reduces to 


+ wn. — a. + 2%... (77) 


(78) 


Vv, the variance of the difference between two second associates is the 
value of 2kA//Aj when the elements of the first column of Aj are re- 
placed by 1, 0, —1, 0. This reduces to 


Ve = {1+ 20 (79) 


(There will be no second associates when p = 3.) 

V, the variance of the difference between two third associates is the 
value of 2kA{/Aj when the elements of the first column of Aj are re- 
placed by 1, 0,0, —1. This reduces to 


1 
vV, the variance of the difference between two fourth associates is the 


value of 2kA//Aj when the elements of the first column are replaced by 
1, 0, 0,0. This reduces to 


1 
Y= = (1 + 2a) (81) 
The mean variance for differences of all pairs of treatments is 


Cochran and Cox (1950) as also Robinson and Watson (1949) suggest 
one common variance for comparing two treatments not occurring in the 




same block, thereby pooling up all 2nd, 3rd and 4th associates into one 
class. The common variance proposed by them is the unweighted 
average of (79), (80) and (81) which is 


2d») (83) 


They then calculate an approximate mean variance for all comparisons 
by taking the weighted average of (78) and (83) which comes out to be 


To the present author it seems more appropriate to use, instead of 
(83), the weighted average of (79), (80) and (81), namely, 


lw — 3p + 3) 
as the common variance for comparing two treatments not occurring in 
the same block. By doing so, there is no need for (84), as the weighted 
average of (78) and (85) will be the actual mean variance V given in (82). 
To obtain the expression for efficiency of the simple rectangular 
lattice, we have to calculate the mean variance 20;/r for randomized 
complete blocks. Substituting ¢ = p(p — 1) and k = (p — 1) in (16) 


for this becomes 
6 
(86) 


Expression for the efficiency of the design reduces to the form 


2p(p—1)(p—2)7" 


(87) 


It can easily be verified that this expression is >1. 

For preparing the table of analysis of variance for the simple rectangular
lattice as in Table 1, we must obtain expression for $t_{hi}$, the intra-block
estimate of the effect of treatment (h, i). This is obtained by
putting w′ = 0 or y = 1 on the r.h.s. of (76) or (77). We can express
$t_{hi}$ in either of the two forms given in (88) or (89)


+0.) Qa +0 | 8) 


2 


1 




The sum of squares due to treatments (eliminating blocks) is 
1 1 
= b> Qi: (> + > 


~ @.. 
p(p — 2) | aed 


The sum of squares due to blocks (eliminating treatments) is 


1 2 
21 (ni. + Yri.) (91) 


Since the design is resolvable, we can eliminate from (91) the part due 
to replications as done in Table 2 of the author’s (1944) paper. 
The sum of squares due to replications is 


1 - 2 


Hence the sum of squares due to blocks within replications (eliminat- 
ing treatments) is got by subtracting (92) from (91). It can be split up 
into two components (a) and (b). Component (a) is 


1 2 2 1 2 2 


1 2 2 1 2 2 


and represents the sum of squares between similar blocks of the l
"x-replications" and of the l "y-replications". It has 2(l − 1)(p − 1)
degrees of freedom and disappears when l = 1, i.e. when only two replications
are used.

Component (b) is obtained by subtracting the sum of (92) and (93) 
from (91) and reduces to 


— pa. 0. — Zs.) — — y...)} (94) 



92 




This component has 2(p — 1) degrees of freedom. 
We may present the analysis of variance as in Table 12. 
According to the author's (1944) paper the estimates of w and w′ are
obtained from this table as follows:


Hence estimates of \, u and y are 


\(E, — E.) 


E, + (lp — p+ DE. 


IME, — E.) 


IpE, + (lp — 2l— p+ DE. 


WE, — E.) 


Y= + DE. 


(95) 


(96) 


(97) 


(98) 


TABLE 12. ANALYSIS OF VARIANCE OF THE SIMPLE RECTANGULAR LATTICE. 


\A=1 


Source of Degrees of Mean 
variation freedom Sum of squares Square 
Replications 21 —1 (92) 
Blocks within repli- |2l(p — 1) (91) — (92) or (93) + (94) E, 
cations (eliminating 
treatments) 
Treatments (elimi- |p? — p — 1 (90) E; 
nating blocks) 
Intra-block error (21 — 1)p(p — 2) > (255 + Yass) — (90) 
~@-}) 
E, 


The results obtained in this section can be linked up with those given
by Grundy (1950) by replacing the notations of this paper by his notations
as follows:
| 
| 
= 
+ itu} 




Notations in Grundy’s 
‘this paper notations 
l r 
(p + 1) 
Brix 
Y By ix 
B,; 
B,; 
Yn.. 
Ru 
Ry 
2... S, 
S, 
Ti: Vii 


Finally, I wish to express my sincere thanks to Miss Gertrude M. 
Cox for reading the manuscript at an early stage and for making valuable 
comments. Sections 5 and 6 of the paper were added on her suggestion. 


8. SUMMARY 


When p.b.i.b. designs were developed by Bose and Nair in 1939 they 
concentrated attention on the construction and analysis of designs in- 
volving either two or three accuracies in the treatment comparisons. 
Detailed formulae for the variances of treatment differences, efficiency, 
etc. were therefore obtained only for these cases. In the present paper 
these formulae have been worked out in detail for designs involving four 
accuracies in the treatment comparisons. 

Since the simple rectangular lattice is a p.b.i.b. design involving four 
accuracies, the statistical analysis of this lattice design has been per- 
formed to illustrate the general method of analysis of p.b.i.b. designs 
involving four accuracies. 

For completeness, formulae for p.b.i.b. designs involving two and 
three accuracies have also been given. The simple square lattice whose 
analysis by direct method is well known to research workers has been used 
to illustrate analysis of a p.b.i.b. design involving two accuracies. 


REFERENCES 


Bose, R. C. and Nair, K. R. Partially balanced incomplete block designs. Sankhya,
4, 337-372, 1939.

Cochran, W. G. and Cox, G. M. Experimental Designs, John Wiley & Sons, Inc.,
New York, 1950.

Grundy, P. M. The estimation of error in rectangular lattices. Biometrics, 6, 25-33,
1950.

Harshbarger, B. Rectangular lattices, Virginia Agricultural Experiment Station,
Memoir 1, 1947.

Nair, K. R. A note on the 'Method of Fitting constants' for analysis of non-orthogonal
data arranged in a double classification. Sankhya, 5, 317-328, 1941.

Nair, K. R. and Rao, C. R. A note on partially balanced incomplete block designs,
Science and Culture, 7, 568-569, 1942.

Nair, K. R. The recovery of inter-block information in incomplete block designs.
Sankhya, 6, 383-390, 1944.

Nair, K. R. Rectangular lattices and partially balanced incomplete block designs.
Biometrics, 7, 145-154, 1951 (a).

Nair, K. R. Some two-replicate partially balanced designs. Calcutta Statistical
Association Bulletin, 3, 174-176, 1951 (b).

Rao, C. R. General methods of analysis for incomplete block designs. Journal of
the American Statistical Association, 42, 541-561, 1947.

Robinson, H. F. and Watson, G. S. An analysis of simple and triple rectangular
lattice designs. North Carolina Agricultural Experiment Station, Tech. Bull. No. 88,
1949.

Yates, F. A new method of arranging variety trials involving a large number of
varieties. Journal of Agricultural Science, 26, 424-455, 1936.


THE COMPUTATION OF SUMS OF SQUARES AND 
PRODUCTS ON A DESK CALCULATOR 


J. M. HAMMERSLEY 


Lectureship in the Design and Analysis of Scientific Experiment, 
University of Oxford 


Introduction. 


In computing a covariance matrix of n variates from observations
$x_{i\alpha}$ ($i = 1, 2, \ldots, n$; $\alpha = 1, 2, \ldots, A$) the main labour lies in
determining the sums

$$S_{ij} = \sum_{\alpha=1}^{A} x_{i\alpha} x_{j\alpha}, \qquad i, j = 1, 2, \ldots, n. \tag{1}$$

This paper, which is only exploratory, enquires into the best method of
doing this when we have an ordinary desk calculator with sufficient
capacity to cumulate products like

$$\sum_{\alpha=1}^{A} (x_{p\alpha}10^{c} + x_{q\alpha})(x_{r\alpha}10^{c} + x_{s\alpha})$$

and obtain $S_{pr}$, $S_{ps} + S_{qr}$, $S_{qs}$ as three distinguishable entries in the
product register. For reference, such a cumulation will be called the
run [p,q,r,s]. Each run yields at most three pieces of information towards
the $\tfrac12 n(n + 1)$ distinct sums $S_{ij}$; and at least one further piece of
information is needed to check* these $S_{ij}$. Hence to determine and
check the $S_{ij}$ we need at least $\rho_0$ runs, where $\rho_0$ is the least integer greater
than n(n + 1)/6. If, in fact, some particular routine for calculating and
checking the $S_{ij}$ requires $\rho_0 + \xi$ runs, then $\xi \ge 0$ is an index of the
efficiency of that routine. We also introduce a second index of efficiency,
$\eta$, defined in the following way. Suppose in the original calculation
(hereafter called the computing routine) of the $S_{ij}$ an error occurs in
precisely one run. The check will show the existence of this error, but
it may not indicate which run was at fault. We have to supplement the
computing routine by an additional one (called the rectifying routine)
which will locate and rectify the faulty run. We write $\eta$ for the expected
number of runs in this rectifying routine. A routine is efficient when it
has a small pair of indices $\{\xi, \eta\}$. In general, reducing one of the two
indices may be at the expense of inflating the other. If a computor
works carefully, the check should succeed; and it will pay to make $\xi$
small at the expense of $\eta$. With this in mind, the emphasis in this paper
is on minimising $\eta$ subject to the condition $\xi = 0$. The introduction of
the second index $\eta$ as a measure of the efficiency of a routine is, I believe,
a new idea. The value of $\eta$ depends upon the design of the rectifying
routine; but it also depends upon the design of the computing routine:
this is the main lesson taught by the present paper.

*A check is a calculation with two possible types of result, success or failure. If the check is
successful, there should be a high assurance that every one of the $S_{ij}$ is correct. The degree of assurance
in some loose fashion measures the rigour of the check; but assessment of this degree is a matter of
personal judgment. For example, I personally consider that the checks described in the second paragraph
of this paper are more rigorous than those described in subsequent paragraphs.
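The notion of a run may be made concrete by imitating the product register in integer arithmetic. The sketch below is only an illustration: it assumes non-negative integer observations and a field width c large enough for the three entries not to overlap, and the function names `run` and `rho0` are ours.

```python
def run(x, p, q, r, s, c=6):
    """Cumulate (x_p*10^c + x_q)(x_r*10^c + x_s) over all observations and
    read off the three entries S_pr, S_ps + S_qr, S_qs of the product register."""
    base = 10 ** c
    register = 0
    for xp, xq, xr, xs in zip(x[p], x[q], x[r], x[s]):
        register += (xp * base + xq) * (xr * base + xs)
    high, rest = divmod(register, base * base)
    middle, low = divmod(rest, base)
    return high, middle, low

def rho0(n):
    """Least integer greater than n(n + 1)/6."""
    return n * (n + 1) // 6 + 1

# Hypothetical observations on variates 1, 2, 3 and 5 (three observations each).
x = {1: [1, 2, 3], 2: [4, 5, 6], 3: [7, 8, 9], 5: [2, 0, 1]}

print(run(x, 1, 2, 1, 2))   # (S11, 2*S12, S22)      -> (14, 64, 77)
print(run(x, 1, 2, 2, 3))   # (S12, S13 + S22, S23)  -> (32, 127, 122)
print(rho0(14))             # 36
```

Negative observations, which the text mentions later as a source of trouble on machines without complete tens transmission, are deliberately excluded from this sketch.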

Two routines, neither of which is as efficient as possible, will
illustrate some of the foregoing ideas. In each of these two routines we
begin by determining an additional variate

$$x_{n+1,\alpha} = \sum_{i=1}^{n} x_{i\alpha}.$$

In the first routine we then calculate $S_{ij}$ for $1 \le i \le n$, $i \le j \le n + 1$.
Following a procedure of Jowett [J. Roy. Statist. Soc. (B) XI (1949) 89-90]
we may do this in $\rho_1$ runs, where $\rho_1$ is the greatest integer not exceeding
(n + 1)(n + 2)/6. The check for this routine is

$$\sum_{j=1}^{n} S_{ij} = S_{i,n+1}, \qquad i = 1, 2, \ldots, n.$$

If $S_{uv}$ is in error, this check will fail for i = u and i = v. The check thus
locates the source of trouble; and one additional run will rectify matters.
Hence

$$\{\xi, \eta\} = \{\rho_1 - \rho_0,\ 1\} = \{\tfrac13(n + 1),\ 1\},$$

where $\rho_1 - \rho_0 = \tfrac13(n + 1)$ may be approximate. For the second routine
we calculate $S_{ij}$ for $1 \le i \le j \le n$ and $i = j = n + 1$, and apply the
check

$$\sum_{i,j=1}^{n} S_{ij} = S_{n+1,n+1}.$$

Following Jowett's method we can do this in $\rho_0$ runs. The check however
gives no information where any error, disclosed by it, lies; and we must
repeat the runs from scratch until we discover the trouble. Hence

$$\{\xi, \eta\} = \{0,\ \tfrac12\rho_0\} = \{0,\ \tfrac1{12}n(n + 1)\},$$

the last expression being approximate.

I have not been able to discover a routine of optimum efficiency,
although evidence points towards an index of the type

$$\{\xi, \eta\} = \{0,\ 2\log_a n\}, \qquad 3 \le a \le 10, \tag{3}$$

being optimum. In support of this conjecture I shall analyse the particular
cases in which n = 3m + 2, m being an integer. I have not
managed to solve the cases in which n ≠ 3m + 2; although, for the
special cases n < 6, (3) can still be satisfied.


The chain routine. 


We shall assume in this paper that a sufficient check of the run
[p,q,r,s] is

either (i) a check of the central item $S_{ps} + S_{qr}$ in the product
register

or (ii) a check of both the end items $S_{pr}$ and $S_{qs}$ in the product
register.

We call these conditions (i) and (ii). In passing, however, let us mention
two ways in which condition (i) is not completely rigorous. First, one
of the counter wheels handling $S_{pr}$ or $S_{qs}$ may misbehave: a machine
without complete tens transmission in the product register is rather
prone to this shortcoming if some of the $x_{i\alpha}$ are negative. Second, consider
the specific case in which $x_{p\alpha}10^{c} + x_{q\alpha}$ is correctly set upon the
keyboard and correctly multiplied by $x_{r\alpha}10^{c}$. Then, in traversing the
carriage, let $x_{q\alpha}$ be accidentally cleared from the keyboard before starting
the multiplication by $x_{s\alpha}$. As a result $S_{ps} + S_{qr}$ will be correct, but
$S_{qs}$ will be short by $x_{q\alpha}x_{s\alpha}$. On a machine, such as the automatic
Marchant, where the tabulating keys are close to the keyboard, clumsy
manipulation of the tabulating keys for traversing the carriage may produce
this failure. However (i) and (ii) do guard against the most prevalent
source of error, namely incorrectly set multiplicands or multipliers.

We shall describe the analysis of the case n = 3m + 2 in geometrical
language. Consider an array T of points $P_{ij}$ ($i, j = 1, 2, \ldots, n$) in
which $P_{ij}$ and $P_{ji}$ are alternative names for one and the same point.
We divide the array into two parts, $T_1$ consisting of the three points
$P_{11}$, $P_{12}$, $P_{22}$, and $T_2$ consisting of all remaining points. With the run
[p,q,r,s] we associate the four (not necessarily distinct) points $P_{pr}$, $P_{ps}$,
$P_{qr}$, $P_{qs}$ and a curve (hereafter called a link) joining $P_{ps}$ to $P_{qr}$. We
shall say that this curve is the link of the run [p,q,r,s], that $P_{ps}$ and $P_{qr}$
are the link points of this link, and that $P_{pr}$ and $P_{qs}$ are the side points of
this link. In case the two link points are not distinct, we take the link
to be a closed curve passing through the (coincident) link points.

In the routine that we shall describe, the first three runs are [1,2,1,2],
[1,2,2,3], and [1,2,3,5]. The first run yields $S_{11}$, $2S_{12}$, $S_{22}$ (in that order
on the product register); the second yields $S_{12}$, $S_{13} + S_{22}$, $S_{23}$; and the
third yields $S_{13}$, $S_{15} + S_{23}$, $S_{25}$. We call these three runs the preliminary
runs.

Beginning now with the point $P_{23}$ (called the origin), we construct a
chain of links between certain pairs of points in the array $T_2$. The first
link runs from $P_{23}$ to $P_{15}$, the next link from $P_{15}$ to $P_{45}$, the succeeding
link from $P_{45}$ to $P_{37}$, and so on with each fresh link starting where the
previous link ended. Finally the chain is to close by the last link returning
to the origin $P_{23}$. In forming this chain we stipulate

(a) the chain shall not pass through any point more than once:
that is to say each link point of the chain belongs to just two
links, the link leading to it and the link leading from it;

and (b) any point of the array $T_2$ which does not lie on the chain is a
side point of precisely one link of the chain.


The following diagram illustrates one such chain for the case n = 14. 


[Diagram: the triangular array T for n = 14, with rows i and columns j numbered 1 to 14 and the links of the chain drawn between its link points. The drawing is not recoverable from this copy.]

In the above diagram of T for n = 14, the point $P_{ij}$ ($i \le j$) lies in the ith
row and the jth column. The reader may verify that this chain does
satisfy the requirements (a) and (b) if he notes that the two side points
of a link are a diagonally opposite pair of points of the rectangle whose
other two vertices are the link points of the link, subject to the condition
(implied by $P_{ij} = P_{ji}$) that any part of such a rectangle falling below
the diagonal edge of T has to be reflected in this edge. Thus the side
points of the link $P_{23}P_{15}$ are $P_{13}$ and $P_{25}$, whilst the side points of the
link $P_{15}P_{45}$ are $P_{14}$ and $P_{55}$. He will also see how the diagram and chain
extend to any value of n = 3m + 2; and he will confirm that the runs
defining the links of such a general chain are

$$\begin{array}{ll}
[3k + 1,\ 3k + 2,\ 3l,\ 3l + 2] & 0 \le k \le l - 1 \le m - 1 \\
[3k + 3,\ 3k + 2,\ 3l,\ 3l + 1] & 0 \le k \le l - 1 \le m - 1 \\
[3k + 4,\ 3k + 3,\ 3l + 1,\ 3l + 2] & 0 \le k \le l - 3 \le m - 3 \\
[1,\ 3l + 2,\ 3l + 1,\ 3l + 2] & 1 \le l \le m \\
[3l - 3,\ 3l - 2,\ 3l + 2,\ 3l - 2] & 2 \le l \le m \\
[3l,\ 3l + 1,\ 3l + 2,\ 3l + 4] & 1 \le l \le m - 1 \\
[3m,\ 3m + 1,\ 3m + 2,\ 3m + 1] &
\end{array} \tag{4}$$
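The defining list (4) lends itself to a mechanical verification of requirements (a) and (b). The sketch below generates the runs for a given m, forms the link joining $P_{ps}$ to $P_{qr}$ for each run [p,q,r,s], and follows the chain round from the origin $P_{23}$. It rests on our reading of the index ranges in (4); the function names are ours.

```python
def chain_runs(m):
    """Runs defining the links of the chain for n = 3m + 2, read from list (4)."""
    runs = []
    for l in range(1, m + 1):
        for k in range(l):                      # 0 <= k <= l - 1
            runs.append((3*k + 1, 3*k + 2, 3*l, 3*l + 2))
            runs.append((3*k + 3, 3*k + 2, 3*l, 3*l + 1))
        for k in range(l - 2):                  # 0 <= k <= l - 3
            runs.append((3*k + 4, 3*k + 3, 3*l + 1, 3*l + 2))
        runs.append((1, 3*l + 2, 3*l + 1, 3*l + 2))
        if l >= 2:
            runs.append((3*l - 3, 3*l - 2, 3*l + 2, 3*l - 2))
        if l <= m - 1:
            runs.append((3*l, 3*l + 1, 3*l + 2, 3*l + 4))
    runs.append((3*m, 3*m + 1, 3*m + 2, 3*m + 1))
    return runs

def point(i, j):
    """P_ij and P_ji are the same point; store it with i <= j."""
    return (min(i, j), max(i, j))

def chain_order(m):
    """Link points in order of occurrence, starting from the origin P_23."""
    neighbours = {}
    for p, q, r, s in chain_runs(m):
        a, b = point(p, s), point(q, r)          # link points of the run
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)
    # Requirement (a): every link point belongs to just two links.
    assert all(len(v) == 2 for v in neighbours.values())
    start = (2, 3)
    order, prev, cur = [start], start, neighbours[start][0]
    while cur != start:
        order.append(cur)
        first, second = neighbours[cur]
        prev, cur = cur, (second if first == prev else first)
    assert len(order) == len(neighbours)         # one closed chain, no stray loops
    return order

def covers_T2(m):
    """Requirement (b): side points and link points together exhaust T_2 once each."""
    n = 3 * m + 2
    side = [pt for p, q, r, s in chain_runs(m)
            for pt in (point(p, r), point(q, s))]
    T2 = [(i, j) for i in range(1, n + 1) for j in range(i, n + 1)
          if (i, j) not in {(1, 1), (1, 2), (2, 2)}]
    return sorted(side + chain_order(m)) == sorted(T2)

order = chain_order(4)                           # the case n = 14
print(len(order), order[:4], covers_T2(4))
```

For n = 14 the chain so generated has 34 links, and its twelfth and twenty-third link points are $P_{2,9}$ and $P_{1,14}$, in agreement with the example used later for the rectifying routine.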


In examining the structure of this chain, the reader may prefer to pass from the defining
sequence of runs (4) to the links of the chain rather than vice versa, since to each
run there corresponds a unique link whereas each link corresponds to two distinct
sets of four runs each. The computation of covariance matrices, with keyboard and
multiplier register each carrying two variates, involves the disposition of four symbols
(the four variates) and hence is an instance of the geometry of a complete quadrilateral.
Any four variates p, q, r, s represent the sides of this quadrilateral, the sides
of whose diagonal triangle determine the three possible associated links $P_{ps}P_{qr}$,
$P_{pq}P_{rs}$, $P_{pr}P_{qs}$; whilst the opposite vertices of this diagonal triangle determine the
three possible pairs of side points $P_{pr}$ and $P_{qs}$, $P_{pq}$ and $P_{rs}$, $P_{ps}$ and $P_{qr}$. The link
$P_{ps}P_{qr}$ corresponds to the set of four equivalent runs

[p,q,r,s], [q,p,s,r], [r,s,p,q], [s,r,q,p]

when its side points are $P_{pr}$ and $P_{qs}$, although it corresponds to the runs

[p,r,q,s], [r,p,s,q], [q,s,p,r], [s,q,r,p]

when its side points are $P_{pq}$ and $P_{rs}$. Indeed the suffices of the side points of either
set generate equal cross-ratios for the runs of the other set. Dually the pair of side
points $P_{ps}$ and $P_{qr}$ corresponds to the set of runs

[p,q,s,r], [q,p,r,s], [s,r,p,q], [r,s,q,p]

when their link is $P_{pr}P_{sq}$, although it corresponds to the set

[p,r,s,q], [r,p,q,s], [s,q,p,r], [q,s,r,p]

when their link is $P_{pq}P_{sr}$; where once again the suffices of the link points of either set
generate equal cross-ratios for the runs of the other set.


In carrying out the run to which any given link of the chain corresponds,
we shall determine $S_{ij}$ for the $P_{ij}$ which are the side points of
that link; and moreover we shall determine the sum of the two $S_{ij}$ for
the $P_{ij}$ which are the link points. The preliminary runs yield the value
of $S_{23}$ for the origin $P_{23}$ of the chain. Hence, knowing $S_{23} + S_{15}$ from
the third preliminary run (which is also the first run of the chain), we
can determine $S_{15}$ by subtraction. Thereupon, $S_{15} + S_{45}$ provided by
the second run of the chain yields $S_{45}$. Next the third run of the chain
provides $S_{45} + S_{37}$, and thus gives $S_{37}$. Link by link progress in this
fashion along the chain yields $S_{ij}$ for all link points $P_{ij}$ of the chain.
The last link of all returns to $P_{23}$, and hence the final run of the chain
yields a second determination of $S_{23}$. Disregarding the improbable
event of two mutually compensating errors arising in independent runs
of the chain, we see that the two determinations of $S_{23}$ will agree if and
only if the central items $S_{ps} + S_{qr}$ of every run [p,q,r,s] of the chain are
correct; and hence, by condition (i), this agreement checks all the runs
of the chain and therefore $S_{ij}$ for all the side points of $T_2$. It also checks
$S_{ij}$ for all the link points of $T_2$, provided that $S_{23}$ is correct. Now $P_{13}$ is a
side point of the first link of the chain; so $S_{13}$ is checked. Agreement on
$S_{12}$ between the first two of the preliminary runs checks the first preliminary
run in accordance with condition (i), and hence checks $S_{22}$.
Therefore $S_{13} + S_{22}$ is checked, and this checks the second preliminary
run. Thus $S_{23}$ is checked. Whereupon $S_{ij}$ for all link points of $T_2$ are
checked. This completes the checking of all $S_{ij}$ in T. Reference to (4)
shows that the total number of runs in the computing routine is

$$2 + \tfrac12(3m - 1)(m + 2) + 1 = \tfrac12(3m + 2)(m + 1) + 1 = \rho_0;$$

and so

$$\xi = 0. \tag{5}$$


Let us now turn to $\eta$, by examining the consequences of a single
faulty run. Consider the table below. This table contains four
main columnar blocks corresponding to the cases in which the faulty run
is any one of the three preliminary runs or is any other run. By hypothesis,
only one run can be at fault. The columnar block corresponding to
the run [1,2,1,2] has three columns. Of these, the column marked 'None'
represents the case in which run [1,2,1,2] is faulty and none of the three
results $S_{11}$, $2S_{12}$, $S_{22}$ are correct; the column marked '$S_{11}$' represents
the case in which run [1,2,1,2] is faulty although the result $S_{11}$ is correct.
In this case, we may assume that $S_{12}$ and $S_{22}$ are both incorrect because


                          Faulty run and correct sums

Check            [1,2,1,2]            [1,2,2,3]            [1,2,3,5]        Other
              None  S11   S22      None  S12   S23      None  S13   S25      Any

S12             F     F     F        F     S     F        S     S     S       S
S13 + S22       F     F     S        F     F     F        F     S     F       S
Odd chain       S     S     S        F     F     S        F     F     F       F
Even chain      S     S     S        S     S     S        F     F     F       F


otherwise conditions (i) or (ii) respectively would be violated. The
remaining columns of the table carry similar interpretations. The rows
of the table correspond to the various checks available. The first row
belongs to the check on the two distinct values of $S_{12}$ provided by runs
[1,2,1,2] and [1,2,2,3]. The second row deals with the check of $S_{13} + S_{22}$
yielded by [1,2,2,3] against $S_{13}$ yielded by [1,2,3,5] and $S_{22}$ yielded by
[1,2,1,2]. The check, arising from a proper numerical closure of the chain
upon returning to $P_{23}$, lies in the third or fourth row according as the
chain in $T_2$ has an odd or an even number of links. The number of links
in the chain will be odd if and only if m = (2 or 3) + (multiple of 4).
In the body of the table, S and F respectively denote success and failure
of the check. The table picks out the source of trouble. For instance,
if all three checks fail when there are an odd number of links in the chain,
run [1,2,2,3] must be at fault. If the first two checks fail but the chain
closes properly, then either [1,2,1,2] or [1,2,2,3] is at fault, though we
cannot say which. If the first two checks succeed while the third fails,
then an unknown link of the chain (including [1,2,3,5] as the first link of
the chain) is faulty. Let us assume that all runs of the computing routine
are equally likely to be the faulty one; and that the three possible
types of error in each run are equally likely. If the results of the checks
show that one of two specified runs is at fault, we first repeat one of
them chosen at random; so that the expected number of runs needed to
rectify this error will be 3/2. An easy calculation (based on the table
above) then shows


n= 5+ po 2%), (6) 


where $\phi$ is the number of runs needed to rectify a faulty run at an
unknown point of the chain. It remains to determine $\phi$, for which purpose
we may assume that the chain possesses just one faulty link.

Let $Q_\kappa$ ($\kappa = 1, 2, \ldots, \rho_0 - 2$) denote the link points of the chain
taken in order of occurrence, starting with $Q_1 = P_{23}$, $Q_2 = P_{15}$, $Q_3 = P_{45}$,
.... Of these points let Q′ and Q″ be such that the number of links in
the three parts of the chain $Q_1Q'$, $Q'Q''$, $Q''Q_1$ are as nearly equal as
possible. For example if $\rho_0 - 2 = 34$ corresponding to the case n = 14,
we may take $Q' = Q_{12} = P_{2,9}$ and $Q'' = Q_{23} = P_{1,14}$. Then for the
first run of the rectifying routine, carry out a run with Q′ and Q″ as
side points: in the instance cited, [2,1,9,14] suffices. This will yield $S_{ij}$
for the points Q′ and Q″. According as the value of $S_{ij}$ for Q′ agrees
or does not agree with that previously determined in the original routine,
we see that the faulty link respectively does not or does lie in the section
of chain $Q_1Q'$. If there is agreement at Q′, then disagreement or agreement
at Q″ shows respectively that the section Q′Q″ is or is not at fault.
If there is agreement at both Q′ and Q″ then the section $Q''Q_1$ contains
the faulty link. We next divide the faulty section into three nearly equal
parts by two points Q‴ and Q⁗; and we let the second run of the rectifying
routine have Q‴ and Q⁗ as its side points. Proceeding in this
fashion we shall progressively shorten the section of chain known to
contain the fault until we ultimately detect and rectify the single faulty
link. A little reflection will show that, if $\phi(A)$ is the expected number of
rectifying runs needed when the faulty section of chain has A links, then


b 
(4) atb+ce=A>4, (7) 
where a, b, c are the number of links in the three nearly equal sections 


into which the chain is broken. We show in the appendix that an 
approximate solution of (7) is 


= 2 + logy A. (8) 


This approximation is on average nearly correct, and never is in error by 
more than 1/6. Substitution into (6) then yields 


2 
= logs 2) + tog, (—25) (9 


approximately. Since $\rho_0 = n(n + 1)/6$ approximately, $\eta$ behaves like
$2\log_3 n$ for large n; and this confirms the weakest form of (3).

I have not succeeded in determining the strongest form of (3). That
stronger forms of (3) can sometimes exist is evident from the following
study of the particular case $n = 14$. According to the foregoing rectifying
routine we should have $\phi = \phi(34) = 141/34$. Consider however the
following alternative rectifying routine. Suppose that the checks show
that some link of the chain is faulty, the discrepancy in the check being $\epsilon$.
Carry out first the run [4,12,11,5]. A study of the diagram will show
that this breaks the chain into five sections instead of three. The run
yields three sources of agreement, namely on $S_{4,11}$, $S_{4,5} + S_{11,12}$, and
$S_{5,12}$. The possible results of these checks are


                           Links containing the faulty link
Check                      1,2    3,4,...,8,9    10,11,...,17,18    19,20,...,24,25    26,27,...,33,34
$S_{4,11}$                 ε      ε              0                  0                  0
$S_{4,5} + S_{11,12}$      2ε     ε              ε                  ε                  0
$S_{5,12}$                 ε      ε              ε                  0                  0


The body of this table shows the respective magnitudes ($0$, $\epsilon$, or $2\epsilon$) of
the disagreements in these checks corresponding to which one of the
five sections contains the faulty link. Since these five possible results are
distinct we can in this one run locate the faulty section. The next table
specifies the appropriate second run, according to the section located by
the first run.


Faulty section indicated                   Faulty sections discriminated
by first run              Second run       by second run
1,2                       [1,2,3,5]        1;2
3,4,...,8,9               [1,7,8,3]        3;4,5;6;7,8,9
10,11,...,17,18           [2,13,9,6]       10,11;12,13,14;15,16,17;18
19,20,...,24,25           [3,14,13,1]      19,20;21,22;23;24,25
26,27,...,33,34           [8,5,9,6]        26,27,28;29,30,31;32,33,34
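
The decoding step after the first run amounts to a table lookup. A minimal Python sketch follows. It takes the row for $S_{4,5} + S_{11,12}$ in the first table as $2\epsilon, \epsilon, \epsilon, \epsilon, 0$, which is an assumption where the printed row is hard to read, and the names `FIRST_RUN`, `SECOND_RUN` and `locate` are hypothetical.

```python
# Discrepancies (in units of epsilon) in the three checks of run [4,12,11,5],
# keyed by the section (first link, last link) that holds the faulty link.
FIRST_RUN = {
    (1, 2): (1, 2, 1),
    (3, 9): (1, 1, 1),
    (10, 18): (0, 1, 1),
    (19, 25): (0, 1, 0),  # middle entry assumed to be epsilon
    (26, 34): (0, 0, 0),
}

# Second run appropriate to each section, as listed in the second table.
SECOND_RUN = {
    (1, 2): [1, 2, 3, 5],
    (3, 9): [1, 7, 8, 3],
    (10, 18): [2, 13, 9, 6],
    (19, 25): [3, 14, 13, 1],
    (26, 34): [8, 5, 9, 6],
}


def locate(d1, d2, d3, eps):
    """Return the faulty section and the second run from the three discrepancies."""
    pattern = (round(d1 / eps), round(d2 / eps), round(d3 / eps))
    for section, signature in FIRST_RUN.items():
        if signature == pattern:
            return section, SECOND_RUN[section]
    raise ValueError("discrepancies do not match a single faulty link")


print(locate(7, 14, 7, 7))
```

The lookup works only because the five signatures are distinct, which is the property the text relies on.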


These two runs locate the faulty link in sections of one, two, or three
links. We can complete the rectification as before. Using equation (12)
of the appendix, we deduce $\phi = 59/17$ for this special rectifying
routine. By substitution into (6) we find $\eta = 3.4760 \cdots$ for the special
routine, whereas in the previous general rectifying routine $\eta = 4.1337 \cdots$.

A rectifying routine of the foregoing type, which at each and every 
run subdivided each section of chain into five subsections, should (for 
large values of n) approximately satisfy (3) with a = 5. I doubt whether 
this is possible for the type of chain discussed previously; but it might be 
possible for some suitably devised chain. Since only four sums $S_{ij}$ can
appear at any one run, it is not possible to break the chain into more than 
five pieces at a time. 

However, if we notice that there is an essential difference between a
rectifying routine and a computing routine, we can develop a method
which will subdivide sections of a chain into ten pieces at each run. To
fix the ideas in this development, let us suppose that the machine has a
capacity of 10 digits in each of the keyboard and multiplier register and
20 digits in the product register, and that each observation $x_{i\alpha}$ is a
3-digit number. In the rectifying routine we now handle six variates
at a time in cumulations of the type

$$\sum_{\alpha} (x_{p\alpha}10^6 + x_{q\alpha}10^3 + x_{r\alpha})(x_{s\alpha}10^6 + x_{t\alpha}10^3 + x_{u\alpha}),$$

leading to the result

$$S_{ps}10^{12} + (S_{pt} + S_{qs})10^9 + (S_{pu} + S_{qt} + S_{rs})10^6 + (S_{qu} + S_{rt})10^3 + S_{ru} \tag{10}$$

in the product register. These sums will naturally intermingle a good
deal; but, as we shall see presently, this is of little consequence. Suppose
now it is possible, for a suitable choice of chain, to choose $p, q, r, s, t, u$ so
that the nine link points whose sums appear in (10) follow one another in a
suitable order on the chain. Let I denote the section of chain between the origin
of the chain and the first of these points, $P_{ps}$; let II denote the section from
$P_{ps}$ to the second; and so on for sections III, IV, $\ldots$, IX, until section X is
that between the last point, $P_{ru}$, and the origin. Let us suppose that in the
original computing routine the central entry of some link (i.e. the sum of
$S_{ij}$ for two consecutive link points) is in error by a quantity $\epsilon$. If from (10), produced in
the rectifying routine, we now subtract the corresponding expression,
produced from the $S_{ij}$ of the computing routine, the product register
will contain one of the following entries according to the section which
contained this faulty link.

Now we know the value of $\epsilon$; for it is the amount by which the chain
fails to close upon returning to the origin. Hence we may divide the
entry in the product register by $\epsilon$; and this will yield one of the coefficients
of $\epsilon$ appearing in the following table.

Faulty section      Entry in product register
I                   1002003002001 ε
II                  2003002001 ε
III                 1003002001 ε
IV                  1002002001 ε
V                   1002001001 ε
VI                  1001001001 ε
VII                 1001001 ε
VIII                1001 ε
IX                  ε
X                   0

These coefficients are all different, so we shall identify the faulty
section. In the case $n = 14$, we may set
for the first rectifying run the variates 2, 5, 13 on the keyboard and 6, 9,
12 in the multiplier register (reading in each case from left to right). This
splits the chain into 10 parts and leads to an eventual value
$\eta = 2.8974 \cdots$.
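
The packing of six variates into one cumulation can be imitated with exact integer arithmetic. The Python sketch below assumes the three variates on each register are spaced three decimal places apart, checks that the cumulated product equals the expression (10) assembled from the separate sums, and then decodes a faulty section from the coefficient table. The data are random and purely hypothetical, as are the names `packed_run`, `expression_10` and `faulty_section`.

```python
import random


def S(data, i, j):
    """Sum of products of variates i and j over all observations."""
    return sum(row[i] * row[j] for row in data)


def packed_run(data, p, q, r, s, t, u):
    """One cumulation with p, q, r on the keyboard and s, t, u in the multiplier."""
    return sum(
        (row[p] * 10**6 + row[q] * 10**3 + row[r])
        * (row[s] * 10**6 + row[t] * 10**3 + row[u])
        for row in data
    )


def expression_10(data, p, q, r, s, t, u):
    """Expression (10), built from the separate sums S_ij."""
    return (
        S(data, p, s) * 10**12
        + (S(data, p, t) + S(data, q, s)) * 10**9
        + (S(data, p, u) + S(data, q, t) + S(data, r, s)) * 10**6
        + (S(data, q, u) + S(data, r, t)) * 10**3
        + S(data, r, u)
    )


# Coefficient of epsilon left in the product register, by faulty section.
SECTION_BY_COEFFICIENT = {
    1002003002001: "I", 2003002001: "II", 1003002001: "III",
    1002002001: "IV", 1002001001: "V", 1001001001: "VI",
    1001001: "VII", 1001: "VIII", 1: "IX", 0: "X",
}


def faulty_section(register_difference, eps):
    """Divide the register difference by eps and look up the section."""
    coefficient, remainder = divmod(register_difference, eps)
    if remainder != 0 or coefficient not in SECTION_BY_COEFFICIENT:
        raise ValueError("difference is not a tabulated multiple of eps")
    return SECTION_BY_COEFFICIENT[coefficient]


rng = random.Random(0)
# Hypothetical data: 20 observations on 14 three-digit variates (index 0 unused).
data = [[0] + [rng.randint(100, 999) for _ in range(14)] for _ in range(20)]
run = (2, 5, 13, 6, 9, 12)
identity_holds = packed_run(data, *run) == expression_10(data, *run)
print(identity_holds, faulty_section(1003002001 * 7, 7))
```

The individual sums overlap in the register, as the text remarks, but only the difference between two such registers is ever decoded, and that difference is an exact multiple of $\epsilon$.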

Throughout this paper we assumed for simplicity that at most one 
run could be faulty. It is, however, only in this last paragraph that this 
assumption becomes a serious limitation. Elsewhere we may usually 
take the index $\eta$ of the rectifying routine to be the expected number of
runs per faulty run. We may also notice that our routines do not involve 
calculating any additional variates such as that specified by equation (2). 


Special cases. 


I give below computing routines for the special cases $n = 3, 4, 6$.
Although they are ad hoc solutions, they may be useful in practical
applications, or they may suggest to the reader a general solution for the
cases $n \ne 3m + 2$. The case $n = 5$ is a special case of $n = 3m + 2$,
which we have already treated.


1. $n = 3$. [1,2,1,2], [2,3,2,3], [3,1,3,1].
2. $n = 4$. [1,2,1,2], [3,4,3,4], [1,4,3,2], [1,2,4,3].
3. $n = 6$. [4,3,5,6], [5,4,5,6], [6,5,6,1], [1,6,1,2],
   [2,1,2,3], [3,2,3,4], [4,3,4,5], [1,2,4,5].
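
These routines can be checked mechanically. The Python sketch below assumes that a run $[a,b,c,d]$ yields the side entries $S_{ac}$ and $S_{bd}$ and the central entry $S_{ad} + S_{bc}$, which is how the run [4,12,11,5] is described earlier. It then confirms that every $S_{ij}$ can be recovered, either directly from a side entry or by subtraction from a central entry. The function names are hypothetical.

```python
from itertools import combinations_with_replacement

ROUTINES = {
    3: [[1, 2, 1, 2], [2, 3, 2, 3], [3, 1, 3, 1]],
    4: [[1, 2, 1, 2], [3, 4, 3, 4], [1, 4, 3, 2], [1, 2, 4, 3]],
    6: [[4, 3, 5, 6], [5, 4, 5, 6], [6, 5, 6, 1], [1, 6, 1, 2],
        [2, 1, 2, 3], [3, 2, 3, 4], [4, 3, 4, 5], [1, 2, 4, 5]],
}


def key(i, j):
    """S_ij and S_ji are the same sum, so store the pair in sorted order."""
    return (min(i, j), max(i, j))


def determined_sums(runs):
    """Set of S_ij recoverable from the side and central entries of the runs."""
    known = set()
    centrals = []
    for a, b, c, d in runs:
        known.update({key(a, c), key(b, d)})          # side entries
        centrals.append((key(a, d), key(b, c)))       # central entry S_ad + S_bc
    changed = True
    while changed:
        changed = False
        for k1, k2 in centrals:
            if k1 == k2 and k1 not in known:          # central entry is 2 * S_ij
                known.add(k1)
                changed = True
            elif k1 in known and k2 not in known:     # subtract the known term
                known.add(k2)
                changed = True
            elif k2 in known and k1 not in known:
                known.add(k1)
                changed = True
    return known


def is_complete(n, runs):
    """True when the runs determine all n(n + 1)/2 sums of squares and products."""
    wanted = {key(i, j) for i, j in combinations_with_replacement(range(1, n + 1), 2)}
    return determined_sums(runs) == wanted


for n, runs in ROUTINES.items():
    print(n, len(runs), is_complete(n, runs))
```

Central entries whose two terms are both already known are left over as checks.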
Appendix. 
From (7) we have

$$\lambda\phi(\lambda) = \lambda + a\phi(a) + b\phi(b) + c\phi(c). \tag{11}$$

We also notice

$$\phi(1) = 1, \qquad \phi(2) = \frac{3}{2}, \qquad \phi(3) = \frac{5}{3}. \tag{12}$$

For instance, if the faulty section has three links, the rectifying procedure
will locate the fault whichever the faulty link, and moreover will rectify
it when the central link is at fault. A second run would however be
needed to rectify either end link. We solve the partitional equation (11)
by the substitutions $\theta(x) = x\phi(x)$, $\Delta\theta(x) = \theta(x + 1) - \theta(x)$.

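We have, for integral $x$,

$$\begin{aligned}
\theta(3x) &= 3x + \theta(x) + \theta(x) + \theta(x), & x > 1,\\
\theta(3x + 1) &= 3x + 1 + \theta(x) + \theta(x) + \theta(x + 1), & x > 1,\\
\theta(3x + 2) &= 3x + 2 + \theta(x) + \theta(x + 1) + \theta(x + 1), & x > 1,\\
\theta(3x + 3) &= 3x + 3 + 3\theta(x + 1),
\end{aligned}$$

and consequently

$$1 + \Delta\theta(x) = \Delta\theta(3x) = \Delta\theta(3x + 1) = \Delta\theta(3x + 2), \qquad x > 1.$$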


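Hence, if $\alpha_j$ ($j = 0, 1, \ldots, \nu$) represent 0, 1, or 2, with the restriction
$\alpha_0 \ne 0$,

$$\Delta\theta\Bigl(\sum_{j=0}^{\nu} \alpha_j 3^{\nu-j}\Bigr) = 1 + \Delta\theta\Bigl(\sum_{j=0}^{\nu-1} \alpha_j 3^{\nu-1-j}\Bigr) = \cdots = (\nu - 1) + \Delta\theta(3\alpha_0 + \alpha_1) = \begin{cases} \nu + 3 & \text{if } \alpha_0 = 1,\ \alpha_1 = 0,\\ \nu + 2 & \text{otherwise,} \end{cases} \tag{13}$$

where the last part of this equation springs from (11) and (12). But
now, if $\nu > 1$,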


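$$\theta(3^\nu) = 3^\nu + 3\theta(3^{\nu-1}) = 2 \cdot 3^\nu + 3^2\theta(3^{\nu-2}) = \cdots = (\nu - 1)3^\nu + 3^{\nu-1}\theta(3) = (\nu + \tfrac{2}{3})3^\nu. \tag{14}$$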
Also by the definition of $\Delta\theta(x)$

$$\theta(3^\nu + x) = \theta(3^\nu) + \sum_{y=0}^{x-1} \Delta\theta(3^\nu + y). \tag{15}$$
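Combination of (13), (14), (15) now yields

$$\theta(3^\nu + x) = \begin{cases} 3^\nu(\nu + \tfrac{2}{3}) + x(\nu + 3), & 0 \le x \le 3^{\nu-1},\\ 3^\nu(\nu + 1) + x(\nu + 2), & 3^{\nu-1} \le x \le 2 \cdot 3^\nu. \end{cases}$$

Finally we have

$$\phi(3^\nu + x) = \begin{cases} \nu + \dfrac{2}{3} + \dfrac{7x}{3(3^\nu + x)}, & 0 \le x \le 3^{\nu-1},\\[1ex] \nu + 1 + \dfrac{x}{3^\nu + x}, & 3^{\nu-1} \le x \le 2 \cdot 3^\nu. \end{cases} \tag{16}$$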


Equation (16) has been established for $\nu > 1$; but it is easy to verify from
(11) and (12) that it remains true for $\nu = 1$. It is false for $\nu = 0$; but
then equation (12) specifies the values of $\phi(1)$ and $\phi(2)$. Although (16)
is exact, it is rather complicated. We can however deduce from it that
$\phi(y) - \log_3 y$ oscillates fairly regularly between $2/3$ and $5/4 - \log_3 (4/3) =
0.9881 \cdots$; and this is the justification for equation (8).
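
The exact solution can be compared with the recurrence directly. The Python sketch below is offered under two assumptions: that $\theta(1) = 1$, $\theta(2) = 3$, $\theta(3) = 5$, and that (16) takes the piecewise form $\nu + 2/3 + 7x/\{3(3^\nu + x)\}$ for $0 \le x \le 3^{\nu-1}$ and $\nu + 1 + x/(3^\nu + x)$ for $3^{\nu-1} \le x \le 2 \cdot 3^\nu$. The names `theta` and `phi_closed` are illustrative.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def theta(lam: int) -> int:
    """theta(x) = x * phi(x), computed from the partitional equation (11)."""
    base = {1: 1, 2: 3, 3: 5}  # assumed starting values
    if lam in base:
        return base[lam]
    q, r = divmod(lam, 3)
    # r sections of q + 1 links and 3 - r sections of q links.
    return lam + r * theta(q + 1) + (3 - r) * theta(q)


def phi_closed(lam: int) -> Fraction:
    """Closed form for phi(3**nu + x), intended for lam >= 3."""
    nu = 0
    while 3 ** (nu + 1) <= lam:
        nu += 1
    x = lam - 3 ** nu
    if 3 * x <= 3 ** nu:  # 0 <= x <= 3**(nu - 1)
        return nu + Fraction(2, 3) + Fraction(7 * x, 3 * (3 ** nu + x))
    return nu + 1 + Fraction(x, 3 ** nu + x)


agree = all(Fraction(theta(k), k) == phi_closed(k) for k in range(3, 3000))
print(agree, phi_closed(34))
```

Under these assumptions the two computations agree exactly over the range tried, and the closed form reproduces $\phi(34) = 141/34$.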


QUERIES 


GEORGE W. SNEDECOR, Editor


QUERY 94: I have often heard arguments advanced by animal husbandmen
and others who believe in sorting experimental material
so as to remove pre-treatment variation between average values 
of some chosen variable. On the occasions when I have run up against 
these proposals, I have conceded the point only on these conditions: that 
the experiment was so designed (as in a randomized block) that all the 
mean squares including that for experimental error were derived from the 
lot means rather than from the individual animals; and that the experi- 
menters could assure me that the outcome of individual animals was of 
minor interest to them. Even then my own feeling has been that the 
experiment would be more informative if, after the experimental animals 
had been divided into outcome classes or “blocks”, the animals were 
assigned at random within the blocks. Then, with pre-treatment meas- 
urement of any selected concomitant variable, the remaining experi- 
mental error could be placed under statistical control. 

Recently I read an article advocating the presorting of experimental
material for the Latin Square design. The illustration was taken from 
an experiment by Stanley D. Nisbet reported in the British Journal of 
Educational Psychology, Volume 9, 1939. To evaluate four methods of 
testing spelling ability, 80 pupils were first given a standard test and 
then, based on the results, were divided into four groups equated with 
regard to all details: total marks, range, and standard deviation. Four
lists of words were used. Each group received the lists of words in a 
systematic order designed to equalize the effect of learning. The variable 
studied was the number of words wrong in the original dictation but 
right in the later test. The analysis of variance is given in the following 
table: 


Source of Variation           D.F.    Sum of Squares    Mean Square

Lists of words (rows)           3          359.5           119.8
Groups of pupils (columns)      3           74.5            24.8
Tests (treatments)              3         4626.5          1542.6
Error                           6          606.5           101.1
Total                          15         5667.0
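
The arithmetic of the table can be retraced in a few lines. The Python sketch below simply divides each sum of squares by its degrees of freedom and adds up the columns; the dictionary layout and variable names are illustrative only. Recomputed this way, the treatment mean square comes to about 1542.2 rather than the 1542.6 shown in the table, while the other mean squares agree with it.

```python
# (degrees of freedom, sum of squares) for each line of the analysis of variance.
anova = {
    "Lists of words (rows)": (3, 359.5),
    "Groups of pupils (columns)": (3, 74.5),
    "Tests (treatments)": (3, 4626.5),
    "Error": (6, 606.5),
}

# Mean square = sum of squares / degrees of freedom.
mean_squares = {source: ss / df for source, (df, ss) in anova.items()}

total_df = sum(df for df, _ in anova.values())
total_ss = sum(ss for _, ss in anova.values())

for source, ms in mean_squares.items():
    print(f"{source:28s} {ms:8.1f}")
print(f"{'Total':28s} d.f. = {total_df}, sum of squares = {total_ss:.1f}")
```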


From the mean squares shown in this table, one might infer that this 
presorting of the pupils resulted in the removal of experimental variation 
from the variance between groups. As I understand it, such a procedure 
may result in assigning this experimental variation to the error term of 
the analysis, so that the mean square for error is likely to be inflated. 
My thinking is that the concomitant variable should be used in a 
covariance analysis rather than to sort out the groups so as to remove 
variation before the study is started. 
I shall appreciate your comments. 


ANSWER: Your discussion involves two devices for increasing the
efficiency of experiments: the grouping of experimental
material and the use of concomitant variables. I know
your remarks are not intended to apply generally, but only to the
two cited instances, where you question the appropriateness of the
grouping and where you think the use of a covariate might produce more
information. Let me comment first on the method of grouping which
you criticize.

The “balancing” of lots, if effective, presumably increases the
precision of the comparison of treatments but at the same time precludes the
evaluation of the precision. Tests of significance are vitiated by the 
non-random allotment and by the increase of the variation within lots 
over that expected in random sampling. You have called attention to 
the latter, the tendency to inflate the mean square for error. Nisbet 
confined himself to estimating treatment effects, refraining from tests of 
significance. This is a procedure with which I have no quarrel though it 
is not one to be recommended generally. 

It was after Nisbet published his results that the latin square analysis 
was made. Nisbet used the design merely to insure that each group 
once and once only received each list for each test. The proposed analy- 
sis of variance introduced five difficulties which had been more or less 
avoided in the original report. First, there is the non-random arrange- 
ment of the latin square. Second, there is the lack of independence 
suspected when treatments are tried successively on the same group. 
Third, there is the grouping bias in the test of significance, already 
mentioned. Fourth, there is the introduction of a new variable, not 
considered by Nisbet, having no obvious correlation with the test score 
used for selecting the groups. Finally, there is the manner of selection 
of the experimental material, inappropriate for the latin square design. 
Since your query is about the selection of groups, it is the final difficulty 
that calls for discussion. 

One of the virtues of the latin square is that it provides for
segregation of anticipated random variation in two dimensions. Any
foreseeable differences among rows and among columns are eliminated from the
estimate of error. This provides the experimenter with a device for using 
knowledge of his material to make groupings which may increase the 
sensitivity of the experiment. If this experiment had been planned to 
test some hypothesis about the scores on the four spelling treatments, 
one effective way of utilizing the standard pretest would have been this: 
form 4 groups ranked from high to low on pre-test performance, assigning 
one group to each column of the square; randomly divide each group 
into 4 sub-groups of 5 pupils, then try one treatment on each subgroup. 
With a suitable criterion for the grouping, a sizable mean square for 
columns could be expected with a corresponding reduction in error. An 
alternative design would be the graeco-latin square for which 16 
groups of students could be selected according to rank in the pre-test. 
These suggested pre-selections would cause some deviations from the 
normal theory of the models but, for practical purposes, randomization 
would seem to be a sufficient safeguard (see Chapter 8 of Kempthorne’s
“Design and Analysis of Experiments”).

Coming now to your second point, the relative merits of preselection
and covariance, I am not aware of any definitive investigation of this
subject. Naturally, one would use qualitative differences (sex, litter, soil
differences, etc.) as a basis for preselecting groups. Equally, I suppose 
it should be natural to use quantitative variables as covariates, but it is 
often not done, perhaps because of the burdensome calculations. Also 
to be considered are questions about linearity of regression, parallelism 
of the group regressions, homogeneity of variance, and so on. Another 
thing to be taken into account is the size of the correlation between the 
criterion of selection and the experimental variable. In my experience 
it is not great. Once I tried both grouping and covariance in a pig nutri- 
tion experiment where the correlation turned out to be 0.5. The results 
were substantially the same; but I assume that somewhat more informa- 
tion might be expected from the covariance scheme, especially if the 
correlation is high. Ultimately I suppose, if the theoretical conditions 
are favorable, the choice would rest on the relative costs of increased 
replication and additional computation. 


ABSTRACTS 


THE BIOMETRIC SOCIETY—ENAR 
4th Annual Meeting—Dec. 27-29, Boston, Mass. 


177 CLYDE P. STROUD (University of Chicago and Argonne
National Laboratory). An Application of Factor Analysis to the
Systematics of the Genus Kalotermes.


By means of multiple factor analysis and related techniques the 
genus Kalotermes (dry-wood termites) was examined in an attempt to 
elucidate its systematic structure. By its use five first-order group- 
factors were identified for the soldier caste and six were established for 
the imago (reproductive) caste. Interpretations of a sample of these 
are presented showing their relation to evolutionary trends in the genus, 
to regressive evolution, and to intra-caste and inter-caste pleiotropism. 
Factor analysis seems to be of great aid in evaluating characters and 
relating them to evolutionary trends. 

An extension of the methods of multiple factor analysis is developed 
which permits the rotational search for groupings among individual 
evaluation points in the common-factor evaluation space. The existence 
of a previously suspected genus or subgenus, Proglyptotermes, is demon- 
strated in the soldier caste. 

Multiple factor analysis, as developed chiefly by L. L. Thurstone, 
is a useful exploratory method for the investigation of many kinds of 
multi-variate domains. Although its use is not limited to domains in 
which usual statistical assumptions can be made, it is desirable that a 
further relation to statistics be developed. 

The centroid method of factoring is discussed briefly. It is seen that 
Thurstone’s simple-structure concept is based on the definition of a 
group-factor. A brief set-theoretical interpretation of this concept is 
developed. The rotational process permits the identification of each 
group-factor with a subset of the variables on which it does not act and 
with the complementary subset on which it does act or with which it is 
associated. General factors are indicated by an absence of complete 
simple-structure at the first-order level or at higher-order levels. 

It is hoped that the methods of factor analysis can supplement the 
more usual systematic techniques and permit a better description of 
some of the millions of evolutionary experiments in which nature has 
engaged. 




178 H. L. SEAL. Discrete Random Processes—The Probability 
Basis of Mortality Theory. 


In a certain community observation has shown that, if a group of 60 
men aged 45 are traced until they attain age 55 or die before reaching 
that age, about 10 of the men will fail to reach age 55. Repeated experi- 
ments with samples of 60 forty-five-year-old men have resulted in num- 
bers of deaths ranging from five to fifteen, but in one instance as many as 
20 deaths were recorded. 

It has been suggested that the same kind of results might be obtained 
if 60 dice were thrown simultaneously over and over again, the number of 
deuces being recorded at each throwing. Here again we would expect 
to see 10 deuces appear at the average throw and would not be overly 
surprised at any number between 5 and 15; twenty deuces might make 
us suspect that the dice (or some of them) were biased. 
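
The dice analogy can be made concrete with the binomial distribution. The short Python sketch below assumes 60 independent throws, each with probability 1/6 of showing a deuce, and computes the mean number of deuces together with the chance of a count between 5 and 15 and of a count of 20 or more. The function and variable names are illustrative.

```python
from math import comb


def binom_pmf(k: int, n: int = 60, p: float = 1 / 6) -> float:
    """Probability of exactly k deuces among n independent dice."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


mean = sum(k * binom_pmf(k) for k in range(61))
p_5_to_15 = sum(binom_pmf(k) for k in range(5, 16))
p_20_or_more = sum(binom_pmf(k) for k in range(20, 61))

print(f"mean = {mean:.2f}")
print(f"P(5 <= deuces <= 15) = {p_5_to_15:.3f}")
print(f"P(deuces >= 20) = {p_20_or_more:.4f}")
```

On this simple model most of the probability lies between 5 and 15, and a count of 20 or more is rare, in line with the reaction described above.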

The object of the paper is to investigate mathematically whether 
such a suggestion is plausible. Although there are no statistical data 
quoted in support of the simple thesis advanced above, the author refers 
to an earlier paper in which he uses mortality data dating as far back as 
1671 to show that the theory is in excellent agreement with the observa- 
tions made. In this paper, however, the argument is developed in 
mathematical terms. 

The assumptions underlying the Death-as-a-game-of-chance idea are 
first analyzed. Different possible variations are proposed and their 
mathematical consequences investigated. As might be expected the 
generalized approach leads to more complex formulas. 

A more subtle generalization is involved by the consideration that 
mortality is a force that operates more strongly the older the individual. 
This means that the 1 in 6 mortality rate quoted above for ages 45-55 is 
composed of a continuum of rates, smaller than 1 in 6 at ages nearer 45 
and larger than 1 in 6 at the ages adjacent to 55. Phenomena which 
involve probabilities (mortality rates in this case) changing as time 
passes have been called “random processes” —hence the title of the paper. 
A special mathematical technique must be employed to solve the prob- 
lem thus posed. Somewhat surprisingly, the earlier “simple game of 
chance” formula appears as a special case of this new general approach. 

The author concludes by providing a few numerical illustrations of 
some of the formulas provided in the paper. The “simple game of 
chance” formula is shown to be fairly close to some of these other formu- 
las even when its underlying assumptions are far from being satisfied. 




179 MARTA FRAENKEL, M.D. (Director, Medical Statistics and
Records Service, Department of Hospitals, City of New York).
Hospital Records as Source of Morbidity Data.


1. Medical diagnoses on hospital patients’ records serve primarily 
clinical and research purposes. Their use can be multiplied when 
diagnoses together with other pertinent data are collected and analyzed 
community-wide. 

2. Such aggregate information would indicate major morbidity
trends in the community. It should be a guide in planning hospital
and other medical care services and public health research.

3. The availability of a Standard Nomenclature, a Classified List of 
Diagnoses and a correlating device between both are significant mile- 
stones for materialization of routine morbidity reporting systems. 

4. By limiting morbidity reporting to hospitalized illness, significant 
professionally established data on all major illnesses can be expected. 


180 N. RASHEVSKY (The University of Chicago). Theory of
Organic Form.


A mathematical biology of organic form can be developed from two 
different points of view. The direct one is through the development of 
the theory of cellular aggregates. Considering aggregates of greater and 
greater complexity, we should, in principle, arrive eventually at a 
theory of the most complex metazoans. Because of tremendous compli- 
cations and mathematical difficulties, this approach is at present prac- 
tically not feasible. 

Another, more formal approach is possible however. Different 
physiological functions of an organ or of a whole organism depend in 
general on the shape and size of the organ or organism. We may ask
what particular shape is best for optimum functioning. Those
optimal shapes should actually prevail due to natural selection. In this 
fashion it is possible to derive some relations, both for plants and 
animals which describe the gross features of their forms. 


181 ANATOL RAPOPORT (The University of Chicago). Theory
of the Central Nervous System. 


The development of a mathematical theory of the central nervous 
system has progressed along two principal paths, the one employing 
continuous concepts, the other discontinuous “quantized” ones. An
example of the continuous approach is the two-factor theory of
Rashevsky, subsequently developed by Landahl, Householder, and others,
which in its mathematical formulation has formal resemblance to the 
two-factor theory of nerve excitation of Rashevsky-Hill-Monnier. The 
two-factor theory of the central nervous system leads to equations cor- 
relating numerous psychological and physiological measurements, for 
example the relation between stimulus strength and response time, 
accuracy of psychophysical discrimination, learning curves, memory 
curves, etc. 

An essentially different approach was initiated by McCulloch and 
Pitts, who have considered the action of individual neurons as composed 
of individual instantaneous events (firings) and have developed a calculus 
for describing the distribution of such events and the framework in 
which they occur (the net) on the basis of symbolic logic. 

There has been also a re-interpretation of the two-factor theory in 
terms of a probabilistic treatment of the discontinuous picture. 

The present paper deals with a third approach based on a generalized 
discontinuous model, which enables one to study a large neural structure 
called a “random net.” The theory deals with some aspects of the
dynamics of such a structure and also leads to equations from which
psychological and physiological parameters can be inferred. The quantities
so deduced are compared with those deduced from the continuous theory. 


182 GEORGE KARREMAN (The University of Chicago). Excita- 
tion and Threshold Phenomena in Irritable Tissues. 


A physico-chemical model of membrane permeability to K⁺-ions
based on reactions of Ca⁺⁺- and K⁺-ions with an unspecified organic
constituent of the membrane is studied. The diffusion of K⁺-ions
through such a membrane is then treated and is found to be described 
by non-linear differential equations. It is shown that such a system 
may possess a threshold. Estimations of permeability increase during 
excitation and of electrical and chemical thresholds are shown to have 
the correct order of magnitude. Several relations between the derived 
thresholds and other physical and chemical variables are predicted. A 
derivation of the one-factor theory of nerve excitation is obtained on the 
basis of the physico-chemical mechanism considered here. The time 
course of the excitatory disturbance for the case of chemical stimulation
is derived theoretically from the above-mentioned model. The rather 
adequate agreement with experimental data on electrical stimulation of 
nerve may indicate that a common threshold phenomenon, such as 
considered here, is the basis of electrical as well as chemical excitation. 
More extensive mathematical studies of a similar mechanism indicate
that relaxation oscillations with a threshold may be based on such a
model. The model gives a tentative explanation for the existence of 
subthreshold oscillations, single action potentials and repetitive dis- 
charges. The mechanism appears to give some indication in agreement 
with experimental data as to the dependence of oscillations upon the 
concentration of Ca⁺⁺-ions.


THE BIOMETRIC SOCIETY—BRITISH REGION 
Abstracts of papers for the meeting on February 21, 1952 


183 K. MATHER AND B. I. HAYMAN. The progress of inbreed- 
ing where heterozygotes are at an advantage. 


The effect of inbreeding is to raise the proportion of homozygotes and 
diminish that of heterozygotes in the population. In the absence of 
complications the final state is complete homozygosis, the speed of 
attainment of this state depending on the closeness of the inbreeding. 
With heterozygotes at a selective advantage, however, the gradual 
diminution which inbreeding tends to bring about in their frequency will 
be offset by the greater number of progeny that they leave in the next 
generation. This must always slow down the progress of change under 
inbreeding and, if the selective advantage is sufficiently great, will pre- 
vent the attainment of homozygosity. The selective disadvantage of 
the homozygotes necessary to prevent homozygosis varies with the 
mating system, being lower with closer inbreeding. With selfing it is 
50%, with sib-mating 15%, parent-offspring mating 13%, half-sib mat- 
ing 6%, and double first cousin mating 4%. If, therefore, close inbreed- 
ing leads to loss of the lines, owing to their low vigour and fertility, it is 
useless to attempt to secure homozygosis more gradually by a looser 
mating system. The result will be merely to stabilise the population at 
a higher level of heterozygosity. Conformity with the Hardy law for 
the frequencies of homozygotes and heterozygotes in a population, 
which is commonly taken as evidence of random mating within that 
population, may result from inbreeding counterbalanced by a selective 
advantage of heterozygotes. 
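
The 50% figure for selfing can be illustrated with a very simple model. The Python sketch below is not the authors' derivation: it assumes a single locus, selfing in every generation, and homozygotes that leave $(1 - s)$ times as many progeny as heterozygotes, so that the heterozygote frequency $H$ becomes $\tfrac{1}{2}H / \{H + (1 - s)(1 - H)\}$ in the next generation. The function names are hypothetical.

```python
def next_heterozygosity(h: float, s: float) -> float:
    """One generation of selfing with homozygote disadvantage s (assumed model)."""
    het_progeny = h                    # heterozygotes have relative fitness 1
    hom_progeny = (1 - s) * (1 - h)    # homozygotes have relative fitness 1 - s
    # Half the progeny of selfed heterozygotes are again heterozygous.
    return 0.5 * het_progeny / (het_progeny + hom_progeny)


def equilibrium(s: float, h0: float = 0.5, generations: int = 500) -> float:
    """Heterozygote frequency after many generations of selfing."""
    h = h0
    for _ in range(generations):
        h = next_heterozygosity(h, s)
    return h


for s in (0.4, 0.6):
    print(s, equilibrium(s))
```

In this model the heterozygotes die out when $s$ is below one half and settle at the frequency $(s - \tfrac{1}{2})/s$ when it is above, which matches the threshold quoted for selfing.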


184 P. M. CLARKE. The statistical analysis of a slope-ratio assay
of several preparations. 


A method of analysis for a symmetrical slope-ratio assay of several 
preparations is presented. It includes specific tests of the validity of the 
assumptions on which the assay is based, and the computations required
appear to be simpler than those of previously published methods. A
numerical example is given, and general methods of analysis are de- 
scribed for multiple slope-ratio assays with and without a common zero 
dose. 


185 J. M. HAMMERSLEY. The computation of covariance 
matrices. 


What is the quickest method of calculating a covariance matrix on a 
desk calculator when the results have to be self-checking? A partial 
solution is given together with some conjectures upon the complete 
solution. 


186 C. L. OAKLEY, J. W. TREVAN, AND P. A. YOUNG. The 
accuracy of the intracutaneous assay of diphtheria antitoxin. 


In the method described by Glenny and Llewellyn Jones (J. Path. 
Bact., 1931 XXXIV, 143) skin reactions are classified into four grades, 
namely, +, +, s+, s. The amount of a particular antitoxin dilution 
reducing the activity of the test dose of toxin to the smallest specific 
reaction (s+) is determined, and values assigned accordingly; simple 
interpolation is used where the actual end point reaction does not show. 
Mixtures are made up to approximately 10% intervals of antitoxin dose 
and each batch is tested in sets of four independent series of dilutions. 

In 146 sets of four results so obtained the mean log variance was
0.000278 (coefficient of variation ca. ±3.9%). The variances were not
distributed in a normal manner however, and it was felt that the precision
of the method was best assessed by the maximum variance observed.
This corresponds to a coefficient of variation of ±10%. The possibility
of avidity and of bias in the readings each contributing to the distortion 
of the variance distribution, as well as the large dose interval employed 
relative to the error, cannot be discounted. 


187 A. F. PARKER-RHODES. Some applications of taxonomic 
distributions. 

Palaeontological evidence shows that world ecology has gone through 
alternating phases of slow and less slow change. We may infer that the 
evolution of any given group of organisms, under ecological influence, 
also has such alternating phases of ecoclasm and ecostasis. A period 
which is for one group an ecoclasm need not be one for others. For many 
groups of fungi and animals, dependent on vegetation, the chief
ecoclasms have been associated with the major changes in the plant world;
these have been (i) invasion of the land (Silurian), (ii) spread of semi-arid
forests (end of Carboniferous), (iii) spread of angiosperms (Cretaceous).
At present the activities of man are producing an ecoclasm for almost 
all organisms, but this has not had time to take effect. 

An ecoclasm destroys many old niches and creates many new ones; 
some of the latter are permanent throughout the ecostasis. These are at 
first occupied by relatively ill-adapted species; by the end of the ecostasis 
they will have acquired more suitable tenants; to a first approximation
the time when a permanent niche acquires its present occupant is
uniformly distributed throughout the ecostasis. An objective genus is
defined as the smallest group of species descended from a common
genarch initially adapted to a permanent niche. It can be shown that
the distribution of sizes of objective genera at the end of an ecostasis
should be in the logarithmic series. We can rarely tell what is a perma- 
nent niche, so that we shall often reckon as a genus what by definition 
should be two or more; conversely we shall count as separate genera 
groups of species which have never had an ancestor in a permanent niche. 
These deviations may be expected to balance, so that actual systems 
ought to exhibit the logarithmic distribution. 

Consistent “lumping” will lead to genera whose genarchs are more
often old than young; “splitting” will produce the reverse effect. A 
lumping system will show a distribution tending towards the geometric 
series, whereas a splitting one will tend toward the Poisson series. In 
practice the deviations must be very gross to show up. The traditional 
classification of Basidiomycetes is presented as an example of a lumping 
system; a more modern system for the same group obeys the logarithmic 
law. We can give an objective definition for a higher taxonomic
category, namely the largest group whose genarch appeared during or since
the last ecoclasm; for the fungi but not all organisms this seems to be the
family. According to theory, the numbers of genera per family as thus 
defined should be distributed in the geometric series. This is the case 
with the Basidiomycetes in the author’s system. 

In a well-classified system the numbers of species per genus in an
isolated territory, if they are wholly recruited by immigration, should be
distributed in a manner intermediate between the Poisson series and the 
distribution in the population from which they came; if some species 
have evolved in the territory itself, the distribution will be more nearly 
logarithmic. This analysis can be applied to the flora of islands. Data 
for the Basidiomycetes of Skokholm Id. show a purely logarithmic 
distribution, significantly different from that of the source of supply; 
from this we infer that there are some species autochthonous to the 
island; the lower limit of one surviving species has been calculated. In 
fact there appears to be at least one. If the analysis is applied to the 
commoner species alone, since no species can originate in this class but 
must enter it from outside, we should expect a pure Poissonian distribu- 
tion, as if it were a population recruited solely by immigration at a 
constant rate. This is in fact observed. 


188 D. J. FINNEY. An experiment on graphical estimation. 


Of those whose work requires them to analyse data on quantal re- 
sponses to a drug or other stimulus at various levels of dose, some choose 
a probit or analogous maximum likelihood technique and perform several 
cycles of iteration before regarding their analysis as complete; others go 
to the opposite extreme and rely on a graphical-cum-nomographic 
technique, that recently proposed by Litchfield and Wilcoxon being 
typical. With the object of obtaining some evidence on the relative 
merits of these, 21 subjects were asked to draw (by eye) two parallel 
regression lines for the same set of data; the data were deliberately 
chosen as difficult for this purpose, and the subjects had no previous 
experience. Analysis showed that one cycle of probit iteration approxi- 
mated well to the maximum likelihood estimate of relative potency and 
the fiducial limits, whichever one of the 21 diagrams was used for the 
provisional lines, but that the Litchfield-Wilcoxon procedure gave only 
a rough guide to the estimate and was considerably less reliable in its 
assessment of fiducial limits. 
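
The following is a minimal sketch of what one cycle of probit iteration involves, under simplifying assumptions: a single regression line rather than the two parallel lines of the experiment, and normal equivalent deviates without the conventional addition of 5. The function names and the data are hypothetical and do not reproduce the author's computations.

```python
import math


def norm_pdf(z):
    """Ordinate of the standard normal curve."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_cycle(x, n, r, a, b):
    """One cycle of probit iteration from the provisional line Y = a + b*x.

    x: log doses; n: subjects per dose; r: numbers responding.
    Returns the improved intercept and slope.
    """
    sw = swx = swy = swxx = swxy = 0.0
    for xi, ni, ri in zip(x, n, r):
        Y = a + b * xi                 # expected deviate from provisional line
        P = norm_cdf(Y)
        Z = norm_pdf(Y)
        w = ni * Z * Z / (P * (1.0 - P))   # weight
        y = Y + (ri / ni - P) / Z          # working deviate
        sw += w
        swx += w * xi
        swy += w * y
        swxx += w * xi * xi
        swxy += w * xi * y
    # Weighted linear regression of working deviates on log dose.
    b_new = (swxy - swx * swy / sw) / (swxx - swx * swx / sw)
    a_new = swy / sw - b_new * swx / sw
    return a_new, b_new


# Hypothetical data; the provisional line stands in for one drawn by eye.
log_dose = [0.0, 0.3, 0.6, 0.9]
subjects = [20, 20, 20, 20]
responding = [3, 8, 14, 18]
print(probit_cycle(log_dose, subjects, responding, a=-1.0, b=2.0))
```

Feeding the returned values back in as the next provisional line gives further cycles; the abstract's finding is that the first cycle already came close to the maximum likelihood estimate.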




THE BIOMETRIC SOCIETY 


An excellent suggestion from Dr. Finney of the British Region urges 
that all members who are planning visits to other organized regions 
should notify the secretary of the region so that he may receive an invi- 
tation to any meetings which may be held during the period of his visit. 
Members who are not familiar with the names of regional secretaries 
may write to the general secretary in New Haven and he will forward 
the information to the individual concerned. This would apply currently 
to all visitors to either Eastern or Western North America, to England, 
France, Australia, Italy, Denmark and the Netherlands. The Secre- 
tary’s office would be glad to advise members of the names and current 
addresses of members in these and other areas of the world where they 
may be traveling. Later this year we hope that a new directory will 
facilitate such international exchanges among our members. 


BRITISH REGION. The British Region held its twelfth meeting at 
the Wellcome Research Institution in London on February 21, 1952. 
Following a morning business session, papers were presented during the 
afternoon by K. Mather and B. I. Hayman; Pamela M. Clarke; J. M. 
Hammersley; C. L. Oakley, J. W. Trevan and P. A. Young; A. F. Parker- 
Rhodes; and D. J. Finney. 

The thirteenth meeting of the Region was held at the Wellcome 
Research Institution in London on April 3rd and featured a paper by 
E. C. Fieller on Numerical Illustrations of Some Current Biometric 
Techniques. 

Abstracts of these papers appear elsewhere in this issue. 


FRENCH REGION. The most recent meeting of the Society took place on 
February 27 at the Laboratoire de Zoologie of the Ecole Normale Supérieure 
in Paris. The following papers were presented: R. Feron, 
Some modifications to be made to the hypotheses of the analysis of 
variance. Applications to agriculture and psychology; A. Vessereau, 
Some practical aspects of sampling problems; F. Milhaud, 
Considerations on the relationships between aptitudes. 

The following officers were elected: President, M. Georges Teissier; 
Secretary-Treasurer, M. Daniel Schwartz; Member of the Council, Mlle 
Janine Ulmo. 


ENAR. The Region held a joint meeting with the Institute of 
Mathematical Statistics at the Virginia Polytechnic Institute in Blacks- 
burg, on March 19th to 21st, with approximately 60 in attendance. At 
the opening session under the chairmanship of I. D. Wilson, M. H. 
Quenouille spoke on “The consequences of testing significance”, with 
discussion opened by H. Fairfield Smith. The afternoon program con- 
cerned Multiple Comparisons under the chairmanship of C. F. Kossack, 
with papers by R. E. Bechhofer, D. B. Duncan, J. W. Tukey, and R. F. 
Link and D. L. Wallace. Two sessions on March 20th were devoted to 
contributed papers, in the morning under the chairmanship of M. E. 
Terry with papers by A. R. Sen, W. D. Foster, Paul Meier, K. R. Nair 
and Paul Irick; and in the afternoon under the chairmanship of P. S. 
Dear with papers by Lester Helms and C. F. Kossack, R. C. Bose and 
K. R. Nair, R. A. Bradley and M. E. Terry, and D. B. Duncan and 
R. C. Rhodes. Preceding the afternoon session R. J. Hader spoke on 
“Double sampling acceptance inspection on measurable quality charac- 
teristics”, with discussion opened by R. C. Bose. At the banquet on 
that evening the principal speaker was H. N. Young, the Director of 
the Virginia Agricultural Experiment Station. The concluding sessions 
on March 21st featured a paper by J. H. Curtiss on “Some chain func- 
tions useful in the Monte Carlo method” and a session on Experimental 
Design with papers by K. R. Nair and by Boyd Harshbarger, with 
Earl Houseman as chairman and the discussion opened by R. C. Bose. 

A symposium on the Design and Interpretation of Clinical Experi- 
ments with Drugs was held jointly with the American Society for 
Pharmacology and Experimental Therapeutics at the Hotel Statler in 
New York City on April 14th. Each topic was considered first by a 
clinician and then by a statistician. The reports concerned “A comparison 
of modified insulins” by J. L. Izzo and S. L. Crump, with discussion 
by Alexander Marble and Paul Meier; “Clinical studies of 
analgesic drugs” by H. K. Beecher and Fred Mosteller, discussed by 
Abraham Wikler and N. B. Eddy, and the “Clinical assay of diuretic 
agents” by Harry Gold, Theodore Greiner and C. I. Bliss, discussed 
by R. A. Lehman and Donald Mainland. About 200 attended the 
symposium. 




